1. 12 May, 2025 1 commit
    • Zhong-Yu Li's avatar
      Add VisualCloze (#11377) · 4f438de3
      Zhong-Yu Li authored
      * VisualCloze
      
      * style quality
      
      * add docs
      
      * add docs
      
      * typo
      
      * Update docs/source/en/api/pipelines/visualcloze.md
      
      * delete einops
      
      * style quality
      
      * Update src/diffusers/pipelines/visualcloze/pipeline_visualcloze.py
      
      * reorg
      
      * refine doc
      
      * style quality
      
      * typo
      
      * typo
      
      * Update src/diffusers/image_processor.py
      
      * add comment
      
      * test
      
      * style
      
      * Modified based on review
      
      * style
      
      * restore image_processor
      
      * update example url
      
      * style
      
      * fix-copies
      
      * VisualClozeGenerationPipeline
      
      * combine
      
      * tests docs
      
      * remove VisualClozeUpsamplingPipeline
      
      * style
      
      * quality
      
      * test examples
      
      * quality style
      
      * typo
      
      * make fix-copies
      
      * fix test_callback_cfg and test_save_load_dduf in VisualClozePipelineFastTests
      
      * add EXAMPLE_DOC_STRING to VisualClozeGenerationPipeline
      
      * delete maybe_free_model_hooks from pipeline_visualcloze_combined
      
      * Apply suggestions from code review
      
      * fix test_save_load_local test; add reason for skipping cfg test
      
      * more save_load test fixes
      
      * fix tests in generation pipeline tests
      4f438de3
  2. 11 Feb, 2025 1 commit
  3. 11 Jul, 2024 1 commit
    • Xin Ma's avatar
      Latte: Latent Diffusion Transformer for Video Generation (#8404) · b8cf84a3
      Xin Ma authored
      
      
      * add Latte to diffusers
      
      * remove print
      
      * remove print
      
      * remove print
      
      * remove unuse codes
      
      * remove layer_norm_latte and add a flag
      
      * remove layer_norm_latte and add a flag
      
      * update latte_pipeline
      
      * update latte_pipeline
      
      * remove unuse squeeze
      
      * add norm_hidden_states.ndim == 2: # for Latte
      
      * fixed test latte pipeline bugs
      
      * fixed test latte pipeline bugs
      
      * delete sh
      
      * add doc for latte
      
      * add licensing
      
      * Move Transformer3DModelOutput to modeling_outputs
      
      * give a default value to sample_size
      
      * remove the einops dependency
      
      * change norm2 for latte
      
      * modify pipeline of latte
      
      * update test for Latte
      
      * modify some codes for latte
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * modify for Latte pipeline
      
      * video_length -> num_frames; update prepare_latents copied from
      
      * make fix-copies
      
      * make style
      
      * typo: videe -> video
      
      * update
      
      * modify for Latte pipeline
      
      * modify latte pipeline
      
      * modify latte pipeline
      
      * modify latte pipeline
      
      * modify latte pipeline
      
      * modify for Latte pipeline
      
      * Delete .vscode directory
      
      * make style
      
      * make fix-copies
      
      * add latte transformer 3d to docs _toctree.yml
      
      * update example
      
      * reduce frames for test
      
      * fixed bug of _text_preprocessing
      
      * set num frame to 1 for testing
      
      * remove unuse print
      
      * add text = self._clean_caption(text) again
      
      ---------
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      Co-authored-by: default avatarAryan <contact.aryanvs@gmail.com>
      Co-authored-by: default avatarAryan <aryan@huggingface.co>
      b8cf84a3
  4. 01 Jun, 2024 1 commit
  5. 31 Jan, 2024 1 commit
  6. 07 Nov, 2023 1 commit
  7. 06 Nov, 2023 1 commit
    • Sayak Paul's avatar
      [Feat] PixArt-Alpha (#5642) · d61889fc
      Sayak Paul authored
      
      
      * init pixart alpha pipeline
      
      * fix: import
      
      * script
      
      * script
      
      * script
      
      * add: vae to the pipeline
      
      * add: vae_scale_factor
      
      * add: checkpoint_path
      
      * clean conversion script a bit.
      
      * size embeddings.
      
      * fix: size embedding
      
      * update scrip
      
      * support for interpolation of position embedding.
      
      * support for conditioning.
      
      * ..
      
      * ..
      
      * ..
      
      * final layer
      
      * final layer
      
      * align if encode_prompt
      
      * support for caption embedding
      
      * refactor
      
      * refactor
      
      * refactor
      
      * start cross attention
      
      * start cross attention
      
      * cross_attention_dim
      
      * cross
      
      * cross
      
      * support for resolution and aspect_ratio
      
      * support for caption projection
      
      * refactor patch embeddings
      
      * batch_size
      
      * up
      
      * commit
      
      * commit
      
      * commit.
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze
      
      * squeeze.
      
      * squeeze.
      
      * fix final block./
      
      * fix final block./
      
      * fix final block./
      
      * clean
      
      * fix: interpolation scale.
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging'
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * make --checkpoint_path non-required.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * remove num_tokens
      
      * timesteps -> timestep
      
      * timesteps -> timestep
      
      * timesteps -> timestep
      
      * timesteps -> timestep
      
      * timesteps -> timestep
      
      * timesteps -> timestep
      
      * debug
      
      * debug
      
      * update conversion script.
      
      * update conversion script.
      
      * update conversion script.
      
      * debug
      
      * debug
      
      * debug
      
      * clean
      
      * debug
      
      * debug
      
      * debug
      
      * debug
      
      * debug
      
      * debug
      
      * debug
      
      * debug
      
      * deug
      
      * debug
      
      * debug
      
      * debug
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * clean
      
      * fix
      
      * fix
      
      * boom
      
      * boom
      
      * some changes
      
      * boom
      
      * save
      
      * up
      
      * remove i
      
      * fix more tests
      
      * DPMSolverMultistepScheduler
      
      * fix
      
      * offloading
      
      * fix conversion script
      
      * fix conversion script
      
      * remove print
      
      * remove support for negative prompt embeds.
      
      * typo.
      
      * remove extra kwargs
      
      * bring conversion script to where it was
      
      * fix
      
      * trying mu luck
      
      * trying my luck again
      
      * again
      
      * again
      
      * again
      
      * clean up
      
      * up
      
      * up
      
      * update example
      
      * support for 512
      
      * remove spacing
      
      * finalize docs.
      
      * test debug
      
      * fix: assertion values.
      
      * debug
      
      * debug
      
      * debug
      
      * fix: repeat
      
      * remove prints.
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * Correct more
      
      * Apply suggestions from code review
      
      * Change all
      
      * Clean more
      
      * fix more
      
      * Fix more
      
      * Fix more
      
      * Correct more
      
      * address patrick's comments.
      
      * remove unneeded args
      
      * clean up pipeline.
      
      * sty;e
      
      * make the use of additional conditions better conditioned.
      
      * None better
      
      * dtype
      
      * height and width validation
      
      * add a note about size brackets.
      
      * fix
      
      * spit out slow test outputs.
      
      * fix?
      
      * fix optional test
      
      * fix more
      
      * remove unneeded comment
      
      * debug
      
      ---------
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      d61889fc