Commits · 28ef0165b91948a8ba9fabc8e993eb11f4c78a58 · renzhc / diffusers_dcu

27 May, 2025 1 commit

[Sana Sprint] add image-to-image pipeline (#11602) · 28ef0165

Linoy Tsaban authored May 27, 2025



* sana sprint img2img

* fix import

* fix name

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* try w/o strength

* try scaling differently

* try with strength

* revert unnecessary changes to scheduler

* revert unnecessary changes to scheduler

* Apply style fixes

* remove comment

* add copy statements

* add copy statements

* add to doc

* add to doc

* add to doc

* add to doc

* Apply style fixes

* empty commit

* fix copies

* fix copies

* fix copies

* fix copies

* fix copies

* docs

* make fix-copies.

* fix doc building error.

* initial commit - add img2img test

* initial commit - add img2img test

* fix import

* fix imports

* Apply style fixes

* empty commit

* remove

* empty commit

* test vocab size

* fix

* fix prompt missing from last commits

* small changes

* fix image processing when input is tensor

* fix order

* Apply style fixes

* empty commit

* fix shape

* remove comment

* image processing

* remove comment

* skip vae tiling test for now

* Apply style fixes

* empty commit

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>


19 May, 2025 1 commit

Use HF Papers (#11567) · c8bb1ff5

Quentin Gallouédec authored May 19, 2025



* Use HF Papers

* Apply style fixes

---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>


24 Apr, 2025 1 commit

Fix typos in strings and comments (#11407) · f00a9957

co63oc authored Apr 25, 2025

13 Apr, 2025 1 commit

[ControlNet] Adds controlnet for SanaTransformer (#11040) · f1f38ffb

Ishan Modi authored Apr 13, 2025



* added controlnet for sana transformer

* improve code quality

* addressed PR comments

* bug fixes

* added test cases

* update

* added dummy objects

* addressed PR comments

* update

* Forcing update

* add to docs

* code quality

* addressed PR comments

* addressed PR comments

* update

* addressed PR comments

* added proper styling

* update

* Revert "added proper styling"

This reverts commit 344ee8a7014ada095b295034ef84341f03b0e359.

* manually ordered

* Apply suggestions from code review

---------
Co-authored-by: Aryan <contact.aryanvs@gmail.com>


25 Mar, 2025 1 commit

add a timestep scale for sana-sprint teacher model (#11150) · 739d6ec7

Junsong Chen authored Mar 26, 2025

21 Mar, 2025 1 commit

add sana-sprint (#11074) · 8a63aa5e

YiYi Xu authored Mar 21, 2025



* add sana-sprint




---------
Co-authored-by: Junsong Chen <cjs1020440147@icloud.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>


22 Feb, 2025 1 commit

Comprehensive type checking for `from_pretrained` kwargs (#10758) · 9c7e2051

Daniel Regado authored Feb 22, 2025



* More robust from_pretrained init_kwargs type checking

* Corrected for Python 3.10

* Type checks subclasses and fixed type warnings

* More type corrections and skip tokenizer type checking

* make style && make quality

* Updated docs and types for Lumina pipelines

* Fixed check for empty signature

* changed location of helper functions

* make style

---------
Co-authored-by: hlky <hlky@hlky.ac>


20 Feb, 2025 1 commit

[tests] test `encode_prompt()` in isolation (#10438) · b2ca39c8

Sayak Paul authored Feb 20, 2025

* poc encode_prompt() tests

* fix

* updates.

* fixes

* fixes

* updates

* updates

* updates

* revert

* updates

* updates

* updates

* updates

* remove SDXLOptionalComponentsTesterMixin.

* remove tests that directly leveraged encode_prompt() in some way or the other.

* fix imports.

* remove _save_load

* fixes

* fixes

* fixes

* fixes


14 Jan, 2025 1 commit

[Sana-4K] (#10537) · 3d707773

Junsong Chen authored Jan 15, 2025



* [Sana 4K]
add 4K support for Sana

* [Sana-4K] fix SanaPAGPipeline

* add VAE automatically tiling function;

* set clean_caption to False;

* add warnings for VAE OOM.

* style

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>


11 Jan, 2025 1 commit

[DC-AE] support tiling for DC-AE (#10510) · e7db062e

Junyu Chen authored Jan 11, 2025



* autoencoder_dc tiling

* add tiling and slicing support in SANA pipelines

* create variables for padding length because the line becomes too long

* add tiling and slicing support in pag SANA pipelines

* revert changes to tile size

* make style

* add vae tiling test

---------
Co-authored-by: Aryan <aryan@huggingface.co>


08 Jan, 2025 2 commits

[Sana 4K] (#10493) · c0964571

Junsong Chen authored Jan 09, 2025

add 4K support for Sana

[Sana][bug fix]change clean_caption from True to False. (#10481) · 80fd9260

Junsong Chen authored Jan 08, 2025

change clean_caption from True to False.

27 Dec, 2024 1 commit

[Add] torch_xla support to pipeline_sana.py (#10364) · 83da817f

SahilCarterr authored Dec 27, 2024

[Add] torch_xla support in pipeline_sana.py

23 Dec, 2024 1 commit

[Sana bug] bug fix for 2K model config (#10340) · b58868e6

Junsong Chen authored Dec 23, 2024



* fix the Positional Embedding bug in 2K model;

* Change the default model to the BF16 one for more stable training and output

* make style

* subtract buffer size

* add compute_module_persistent_sizes

---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>


18 Dec, 2024 1 commit

[LoRA] feat: lora support for SANA. (#10234) · 9408aa2d

Sayak Paul authored Dec 18, 2024



* feat: lora support for SANA.

* make fix-copies

* rename test class.

* attention_kwargs -> cross_attention_kwargs.

* Revert "attention_kwargs -> cross_attention_kwargs."

This reverts commit 23433bf9bccc12e0f2f55df26bae58a894e8b43b.

* exhaust 119 max line limit

* sana lora fine-tuning script.

* readme

* add a note about the supported models.

* Apply suggestions from code review
Co-authored-by: Aryan <aryan@huggingface.co>

* style

* docs for attention_kwargs.

* remove lora_scale from pag pipeline.

* copy fix

---------
Co-authored-by: Aryan <aryan@huggingface.co>


15 Dec, 2024 1 commit

[Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`,... · 5a196e3d

Junsong Chen authored Dec 16, 2024


[Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`, `LinearAttentionProcessor`, `Flow-based DPM-solver` and so on. (#9982)

* first add a script for DC-AE;

* DC-AE init

* replace triton with custom implementation

* 1. rename file and remove un-used codes;

* no longer rely on omegaconf and dataclass

* replace custom activation with diffusers activation

* remove dc_ae attention in attention_processor.py

* inherit from ModelMixin

* inherit from ConfigMixin

* dc-ae reduce to one file

* update downsample and upsample

* clean code

* support DecoderOutput

* remove get_same_padding and val2tuple

* remove autocast and some assert

* update ResBlock

* remove contents within super().__init__

* Update src/diffusers/models/autoencoders/dc_ae.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove opsequential

* update other blocks to support the removal of build_norm

* remove build encoder/decoder project in/out

* remove inheritance of RMSNorm2d from LayerNorm

* remove reset_parameters for RMSNorm2d
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove device and dtype in RMSNorm2d __init__
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove op_list & build_block

* remove build_stage_main

* change file name to autoencoder_dc

* move LiteMLA to attention.py

* align with other vae decode output;

* add DC-AE into init files;

* update

* make quality && make style;

* quick push before dgx disappears again

* update

* make style

* update

* update

* fix

* refactor

* refactor

* refactor

* update

* possibly change to nn.Linear

* refactor

* make fix-copies

* replace vae with ae

* replace get_block_from_block_type to get_block

* replace downsample_block_type from Conv to conv for consistency

* add scaling factors

* incorporate changes for all checkpoints

* make style

* move mla to attention processor file; split qkv conv to linears

* refactor

* add tests

* from original file loader

* add docs

* add standard autoencoder methods

* combine attention processor

* fix tests

* update

* minor fix

* minor fix

* minor fix & in/out shortcut rename

* minor fix

* make style

* fix paper link

* update docs

* update single file loading

* make style

* remove single file loading support; todo for DN6

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add abstract

* 1. add DCAE into diffusers;
2. make style and make quality;

* add DCAE_HF into diffusers;

* bug fixed;

* add SanaPipeline, SanaTransformer2D into diffusers;

* add sanaLinearAttnProcessor2_0;

* first update for SanaTransformer;

* first update for SanaPipeline;

* first success run SanaPipeline;

* model output finally matches the original model given the same input;

* code update;

* code update;

* add a flow dpm-solver scripts

* 🎉[important update]
1. Integrate flow-dpm-solver into diffusers;
2. finally run successfully on both `FlowMatchEulerDiscreteScheduler` and `FlowDPMSolverMultistepScheduler`;

* 🎉🔧

[important update & fix huge bugs!!]
1. add SanaPAGPipeline & several related Sana linear attention operators;
2. `SanaTransformer2DModel` now supports multi-resolution input;
3. fix the multi-scale HW bugs in SanaPipeline and SanaPAGPipeline;
4. fix the flow-dpm-solver set_timestep() init `model_output` and `lower_order_nums` bugs;

* remove prints;

* add convert sana official checkpoint to diffusers format Safetensor.

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/pag/pipeline_pag_sana.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/sana/pipeline_sana.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/sana/pipeline_sana.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update Sana for DC-AE's recent commit;

* make style && make quality

* Add StableDiffusion3PAGImg2Img Pipeline + Fix SD3 Unconditional PAG (#9932)

* fix progress bar updates in SD 1.5 PAG Img2Img pipeline

---------
Co-authored-by: Vinh H. Pham <phamvinh257@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* make the vae can be None in `__init__` of `SanaPipeline`

* Update src/diffusers/models/transformers/sana_transformer_2d.py
Co-authored-by: hlky <hlky@hlky.ac>

* change the ae related code due to the latest update of DCAE branch;

* change the ae related code due to the latest update of DCAE branch;

* 1. change code based on AutoencoderDC;
2. fix the bug of new GLUMBConv;
3. run success;

* update for solving conversation.

* 1. fix bugs and run convert script success;
2. Downloading ckpt from hub automatically;

* make style && make quality;

* 1. remove unused parameters in init;
2. code update;

* remove test file

* refactor; add docs; add tests; update conversion script

* make style

* make fix-copies

* refactor

* update pipelines

* pag tests and refactor

* remove sana pag conversion script

* handle weight casting in conversion script

* update conversion script

* add a processor

* 1. add bf16 pth file path;
2. add complex human instruct in pipeline;

* fix fast tests

* change gemma-2-2b-it ckpt to a non-gated repo;

* fix the pth path bug in conversion script;

* change grad ckpt to original; make style

* fix the complex_human_instruct bug and typo;

* remove dpmsolver flow scheduler

* apply review suggestions

* change the `FlowMatchEulerDiscreteScheduler` to default `DPMSolverMultistepScheduler` with flow matching scheduler.

* fix the tokenizer.padding_side='right' bug;

* update docs

* make fix-copies

* fix imports

* fix docs

* add integration test

* update docs

* update examples

* fix convert_model_output in schedulers

* fix failing tests

---------
Co-authored-by: Junyu Chen <chenjydl2003@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: chenjy2003 <70215701+chenjy2003@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
