  1. 27 May, 2025 1 commit
    • [Sana Sprint] add image-to-image pipeline (#11602) · 28ef0165
      Linoy Tsaban authored
      
      
      * sana sprint img2img
      
      * fix import
      
      * fix name
      
      * fix image encoding
      
      * fix image encoding
      
      * fix image encoding
      
      * fix image encoding
      
      * fix image encoding
      
      * fix image encoding
      
      * try w/o strength
      
      * try scaling differently
      
      * try with strength
      
      * revert unnecessary changes to scheduler
      
      * revert unnecessary changes to scheduler
      
      * Apply style fixes
      
      * remove comment
      
      * add copy statements
      
      * add copy statements
      
      * add to doc
      
      * add to doc
      
      * add to doc
      
      * add to doc
      
      * Apply style fixes
      
      * empty commit
      
      * fix copies
      
      * fix copies
      
      * fix copies
      
      * fix copies
      
      * fix copies
      
      * docs
      
      * make fix-copies.
      
      * fix doc building error.
      
      * initial commit - add img2img test
      
      * initial commit - add img2img test
      
      * fix import
      
      * fix imports
      
      * Apply style fixes
      
      * empty commit
      
      * remove
      
      * empty commit
      
      * test vocab size
      
      * fix
      
      * fix prompt missing from last commits
      
      * small changes
      
      * fix image processing when input is tensor
      
      * fix order
      
      * Apply style fixes
      
      * empty commit
      
      * fix shape
      
      * remove comment
      
      * image processing
      
      * remove comment
      
      * skip vae tiling test for now
      
      * Apply style fixes
      
      * empty commit
      
      ---------
      Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
      Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
  2. 19 May, 2025 1 commit
  3. 24 Apr, 2025 1 commit
  4. 13 Apr, 2025 1 commit
    • [ControlNet] Adds controlnet for SanaTransformer (#11040) · f1f38ffb
      Ishan Modi authored
      
      
      * added controlnet for sana transformer
      
      * improve code quality
      
      * addressed PR comments
      
      * bug fixes
      
      * added test cases
      
      * update
      
      * added dummy objects
      
      * addressed PR comments
      
      * update
      
      * Forcing update
      
      * add to docs
      
      * code quality
      
      * addressed PR comments
      
      * addressed PR comments
      
      * update
      
      * addressed PR comments
      
      * added proper styling
      
      * update
      
      * Revert "added proper styling"
      
      This reverts commit 344ee8a7014ada095b295034ef84341f03b0e359.
      
      * manually ordered
      
      * Apply suggestions from code review
      
      ---------
      Co-authored-by: Aryan <contact.aryanvs@gmail.com>
  5. 25 Mar, 2025 1 commit
  6. 21 Mar, 2025 1 commit
  7. 22 Feb, 2025 1 commit
    • Comprehensive type checking for `from_pretrained` kwargs (#10758) · 9c7e2051
      Daniel Regado authored
      
      
      * More robust from_pretrained init_kwargs type checking
      
      * Corrected for Python 3.10
      
      * Type checks subclasses and fixed type warnings
      
      * More type corrections and skip tokenizer type checking
      
      * make style && make quality
      
      * Updated docs and types for Lumina pipelines
      
      * Fixed check for empty signature
      
      * changed location of helper functions
      
      * make style
      
      ---------
      Co-authored-by: hlky <hlky@hlky.ac>
  8. 20 Feb, 2025 1 commit
    • [tests] test `encode_prompt()` in isolation (#10438) · b2ca39c8
      Sayak Paul authored
      * poc encode_prompt() tests
      
      * fix
      
      * updates.
      
      * fixes
      
      * fixes
      
      * updates
      
      * updates
      
      * updates
      
      * revert
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * remove SDXLOptionalComponentsTesterMixin.
      
      * remove tests that directly leveraged encode_prompt() in some way or the other.
      
      * fix imports.
      
      * remove _save_load
      
      * fixes
      
      * fixes
      
      * fixes
      
      * fixes
  9. 14 Jan, 2025 1 commit
    • [Sana-4K] (#10537) · 3d707773
      Junsong Chen authored
      
      
      * [Sana 4K]
      add 4K support for Sana
      
      * [Sana-4K] fix SanaPAGPipeline
      
      * add VAE automatically tiling function;
      
      * set clean_caption to False;
      
      * add warnings for VAE OOM.
      
      * style
      
      ---------
      Co-authored-by: yiyixuxu <yixu310@gmail.com>
  10. 11 Jan, 2025 1 commit
    • [DC-AE] support tiling for DC-AE (#10510) · e7db062e
      Junyu Chen authored
      
      
      * autoencoder_dc tiling
      
      * add tiling and slicing support in SANA pipelines
      
      * create variables for padding length because the line becomes too long
      
      * add tiling and slicing support in pag SANA pipelines
      
      * revert changes to tile size
      
      * make style
      
      * add vae tiling test
      
      ---------
      Co-authored-by: Aryan <aryan@huggingface.co>
  11. 08 Jan, 2025 2 commits
  12. 27 Dec, 2024 1 commit
  13. 23 Dec, 2024 1 commit
  14. 18 Dec, 2024 1 commit
    • [LoRA] feat: lora support for SANA. (#10234) · 9408aa2d
      Sayak Paul authored
      
      
      * feat: lora support for SANA.
      
      * make fix-copies
      
      * rename test class.
      
      * attention_kwargs -> cross_attention_kwargs.
      
      * Revert "attention_kwargs -> cross_attention_kwargs."
      
      This reverts commit 23433bf9bccc12e0f2f55df26bae58a894e8b43b.
      
      * exhaust 119 max line limit
      
      * sana lora fine-tuning script.
      
      * readme
      
      * add a note about the supported models.
      
      * Apply suggestions from code review
      Co-authored-by: Aryan <aryan@huggingface.co>
      
      * style
      
      * docs for attention_kwargs.
      
      * remove lora_scale from pag pipeline.
      
      * copy fix
      
      ---------
      Co-authored-by: Aryan <aryan@huggingface.co>
  15. 15 Dec, 2024 1 commit
    • [Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`,... · 5a196e3d
      Junsong Chen authored
      
      [Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`, `LinearAttentionProcessor`, `Flow-based DPM-solver` and so on. (#9982)
      
      * first add a script for DC-AE;
      
      * DC-AE init
      
      * replace triton with custom implementation
      
      * 1. rename file and remove un-used codes;
      
      * no longer rely on omegaconf and dataclass
      
      * replace custom activation with diffusers activation
      
      * remove dc_ae attention in attention_processor.py
      
      * inherit from ModelMixin
      
      * inherit from ConfigMixin
      
      * dc-ae reduce to one file
      
      * update downsample and upsample
      
      * clean code
      
      * support DecoderOutput
      
      * remove get_same_padding and val2tuple
      
      * remove autocast and some assert
      
      * update ResBlock
      
      * remove contents within super().__init__
      
      * Update src/diffusers/models/autoencoders/dc_ae.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * remove opsequential
      
      * update other blocks to support the removal of build_norm
      
      * remove build encoder/decoder project in/out
      
      * remove inheritance of RMSNorm2d from LayerNorm
      
      * remove reset_parameters for RMSNorm2d
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * remove device and dtype in RMSNorm2d __init__
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/models/autoencoders/dc_ae.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/models/autoencoders/dc_ae.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/models/autoencoders/dc_ae.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * remove op_list & build_block
      
      * remove build_stage_main
      
      * change file name to autoencoder_dc
      
      * move LiteMLA to attention.py
      
      * align with other vae decode output;
      
      * add DC-AE into init files;
      
      * update
      
      * make quality && make style;
      
      * quick push before dgx disappears again
      
      * update
      
      * make style
      
      * update
      
      * update
      
      * fix
      
      * refactor
      
      * refactor
      
      * refactor
      
      * update
      
      * possibly change to nn.Linear
      
      * refactor
      
      * make fix-copies
      
      * replace vae with ae
      
      * replace get_block_from_block_type to get_block
      
      * replace downsample_block_type from Conv to conv for consistency
      
      * add scaling factors
      
      * incorporate changes for all checkpoints
      
      * make style
      
      * move mla to attention processor file; split qkv conv to linears
      
      * refactor
      
      * add tests
      
      * from original file loader
      
      * add docs
      
      * add standard autoencoder methods
      
      * combine attention processor
      
      * fix tests
      
      * update
      
      * minor fix
      
      * minor fix
      
      * minor fix & in/out shortcut rename
      
      * minor fix
      
      * make style
      
      * fix paper link
      
      * update docs
      
      * update single file loading
      
      * make style
      
      * remove single file loading support; todo for DN6
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * add abstract
      
      * 1. add DCAE into diffusers;
      2. make style and make quality;
      
      * add DCAE_HF into diffusers;
      
      * bug fixed;
      
      * add SanaPipeline, SanaTransformer2D into diffusers;
      
      * add sanaLinearAttnProcessor2_0;
      
      * first update for SanaTransformer;
      
      * first update for SanaPipeline;
      
      * first success run SanaPipeline;
      
      * model output finally matches the original model given the same input;
      
      * code update;
      
      * code update;
      
      * add a flow dpm-solver scripts
      
      * 🎉[important update]
      1. Integrate flow-dpm-solver into diffusers;
      2. finally run successfully on both `FlowMatchEulerDiscreteScheduler` and `FlowDPMSolverMultistepScheduler`;
      
      * 🎉🔧
      
      [important update & fix huge bugs!!]
      1. add SanaPAGPipeline & several related Sana linear attention operators;
      2. `SanaTransformer2DModel` not supports multi-resolution input;
      3. fix the multi-scale HW bugs in SanaPipeline and SanaPAGPipeline;
      4. fix the flow-dpm-solver set_timestep() init `model_output` and `lower_order_nums` bugs;
      
      * remove prints;
      
      * add convert sana official checkpoint to diffusers format Safetensor.
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/pipelines/pag/pipeline_pag_sana.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/pipelines/sana/pipeline_sana.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/pipelines/sana/pipeline_sana.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * update Sana for DC-AE's recent commit;
      
      * make style && make quality
      
      * Add StableDiffusion3PAGImg2Img Pipeline + Fix SD3 Unconditional PAG (#9932)
      
      * fix progress bar updates in SD 1.5 PAG Img2Img pipeline
      
      ---------
      Co-authored-by: Vinh H. Pham <phamvinh257@gmail.com>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * make the vae can be None in `__init__` of `SanaPipeline`
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: hlky <hlky@hlky.ac>
      
      * change the ae related code due to the latest update of DCAE branch;
      
      * change the ae related code due to the latest update of DCAE branch;
      
      * 1. change code based on AutoencoderDC;
      2. fix the bug of new GLUMBConv;
      3. run success;
      
      * update for solving conversation.
      
      * 1. fix bugs and run convert script success;
      2. Downloading ckpt from hub automatically;
      
      * make style && make quality;
      
      * 1. remove unused parameters in init;
      2. code update;
      
      * remove test file
      
      * refactor; add docs; add tests; update conversion script
      
      * make style
      
      * make fix-copies
      
      * refactor
      
      * update pipelines
      
      * pag tests and refactor
      
      * remove sana pag conversion script
      
      * handle weight casting in conversion script
      
      * update conversion script
      
      * add a processor
      
      * 1. add bf16 pth file path;
      2. add complex human instruct in pipeline;
      
      * fix fast tests
      
      * change gemma-2-2b-it ckpt to a non-gated repo;
      
      * fix the pth path bug in conversion script;
      
      * change grad ckpt to original; make style
      
      * fix the complex_human_instruct bug and typo;
      
      * remove dpmsolver flow scheduler
      
      * apply review suggestions
      
      * change the `FlowMatchEulerDiscreteScheduler` to default `DPMSolverMultistepScheduler` with flow matching scheduler.
      
      * fix the tokenizer.padding_side='right' bug;
      
      * update docs
      
      * make fix-copies
      
      * fix imports
      
      * fix docs
      
      * add integration test
      
      * update docs
      
      * update examples
      
      * fix convert_model_output in schedulers
      
      * fix failing tests
      
      ---------
      Co-authored-by: Junyu Chen <chenjydl2003@gmail.com>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: chenjy2003 <70215701+chenjy2003@users.noreply.github.com>
      Co-authored-by: Aryan <aryan@huggingface.co>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: hlky <hlky@hlky.ac>