1. 05 Dec, 2024 1 commit
  2. 03 Dec, 2024 1 commit
  3. 18 Nov, 2024 1 commit
  4. 05 Nov, 2024 1 commit
    • Aryan's avatar
      [core] Mochi T2V (#9769) · 3f329a42
      Aryan authored
      
      
      * update
      
      * udpate
      
      * update transformer
      
      * make style
      
      * fix
      
      * add conversion script
      
      * update
      
      * fix
      
      * update
      
      * fix
      
      * update
      
      * fixes
      
      * make style
      
      * update
      
      * update
      
      * update
      
      * init
      
      * update
      
      * update
      
      * add
      
      * up
      
      * up
      
      * up
      
      * update
      
      * mochi transformer
      
      * remove original implementation
      
      * make style
      
      * update inits
      
      * update conversion script
      
      * docs
      
      * Update src/diffusers/pipelines/mochi/pipeline_mochi.py
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      
      * Update src/diffusers/pipelines/mochi/pipeline_mochi.py
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      
      * fix docs
      
      * pipeline fixes
      
      * make style
      
      * invert sigmas in scheduler; fix pipeline
      
      * fix pipeline num_frames
      
      * flip proj and gate in swiglu
      
      * make style
      
      * fix
      
      * make style
      
      * fix tests
      
      * latent mean and std fix
      
      * update
      
      * cherry-pick 1069d210e1b9e84a366cdc7a13965626ea258178
      
      * remove additional sigma already handled by flow match scheduler
      
      * fix
      
      * remove hardcoded value
      
      * replace conv1x1 with linear
      
      * Update src/diffusers/pipelines/mochi/pipeline_mochi.py
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      
      * framewise decoding and conv_cache
      
      * make style
      
      * Apply suggestions from code review
      
      * mochi vae encoder changes
      
      * rebase correctly
      
      * Update scripts/convert_mochi_to_diffusers.py
      
      * fix tests
      
      * fixes
      
      * make style
      
      * update
      
      * make style
      
      * update
      
      * add framewise and tiled encoding
      
      * make style
      
      * make original vae implementation behaviour the default; note: framewise encoding does not work
      
      * remove framewise encoding implementation due to presence of attn layers
      
      * fight test 1
      
      * fight test 2
      
      ---------
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: default avataryiyixuxu <yixu310@gmail.com>
      3f329a42
  5. 29 Oct, 2024 1 commit
  6. 14 Oct, 2024 1 commit
    • Yuxuan.Zhang's avatar
      CogView3Plus DiT (#9570) · 8d81564b
      Yuxuan.Zhang authored
      * merge 9588
      
      * max_shard_size="5GB" for colab running
      
      * conversion script updates; modeling test; refactor transformer
      
      * make fix-copies
      
      * Update convert_cogview3_to_diffusers.py
      
      * initial pipeline draft
      
      * make style
      
      * fight bugs 🐛
      
      🪳
      
      * add example
      
      * add tests; refactor
      
      * make style
      
      * make fix-copies
      
      * add co-author
      
      YiYi Xu <yixu310@gmail.com>
      
      * remove files
      
      * add docs
      
      * add co-author
      Co-Authored-By: default avatarYiYi Xu <yixu310@gmail.com>
      
      * fight docs
      
      * address reviews
      
      * make style
      
      * make model work
      
      * remove qkv fusion
      
      * remove qkv fusion tets
      
      * address review comments
      
      * fix make fix-copies error
      
      * remove None and TODO
      
      * for FP16(draft)
      
      * make style
      
      * remove dynamic cfg
      
      * remove pooled_projection_dim as a parameter
      
      * fix tests
      
      ---------
      Co-authored-by: default avatarAryan <aryan@huggingface.co>
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      8d81564b
  7. 16 Sep, 2024 1 commit
    • Yuxuan.Zhang's avatar
      CogVideoX-5b-I2V support (#9418) · 8336405e
      Yuxuan.Zhang authored
      
      
      * draft Init
      
      * draft
      
      * vae encode image
      
      * make style
      
      * image latents preparation
      
      * remove image encoder from conversion script
      
      * fix minor bugs
      
      * make pipeline work
      
      * make style
      
      * remove debug prints
      
      * fix imports
      
      * update example
      
      * make fix-copies
      
      * add fast tests
      
      * fix import
      
      * update vae
      
      * update docs
      
      * update image link
      
      * apply suggestions from review
      
      * apply suggestions from review
      
      * add slow test
      
      * make use of learned positional embeddings
      
      * apply suggestions from review
      
      * doc change
      
      * Update convert_cogvideox_to_diffusers.py
      
      * make style
      
      * final changes
      
      * make style
      
      * fix tests
      
      ---------
      Co-authored-by: default avatarAryan <aryan@huggingface.co>
      8336405e
  8. 11 Sep, 2024 1 commit
  9. 03 Sep, 2024 2 commits
  10. 29 Aug, 2024 2 commits
  11. 25 Aug, 2024 1 commit
  12. 23 Aug, 2024 1 commit
  13. 21 Aug, 2024 1 commit
  14. 07 Aug, 2024 1 commit
  15. 03 Aug, 2024 1 commit
  16. 01 Aug, 2024 1 commit
  17. 30 Jul, 2024 1 commit
    • Yoach Lacombe's avatar
      Stable Audio integration (#8716) · 69e72b1d
      Yoach Lacombe authored
      
      
      * WIP modeling code and pipeline
      
      * add custom attention processor + custom activation + add to init
      
      * correct ProjectionModel forward
      
      * add stable audio to __initèè
      
      * add autoencoder and update pipeline and modeling code
      
      * add half Rope
      
      * add partial rotary v2
      
      * add temporary modfis to scheduler
      
      * add EDM DPM Solver
      
      * remove TODOs
      
      * clean GLU
      
      * remove att.group_norm to attn processor
      
      * revert back src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
      
      * refactor GLU -> SwiGLU
      
      * remove redundant args
      
      * add channel multiples in autoencoder docstrings
      
      * changes in docsrtings and copyright headers
      
      * clean pipeline
      
      * further cleaning
      
      * remove peft and lora and fromoriginalmodel
      
      * Delete src/diffusers/pipelines/stable_audio/diffusers.code-workspace
      
      * make style
      
      * dummy models
      
      * fix copied from
      
      * add fast oobleck tests
      
      * add brownian tree
      
      * oobleck autoencoder slow tests
      
      * remove TODO
      
      * fast stable audio pipeline tests
      
      * add slow tests
      
      * make style
      
      * add first version of docs
      
      * wrap is_torchsde_available to the scheduler
      
      * fix slow test
      
      * test with input waveform
      
      * add input waveform
      
      * remove some todos
      
      * create stableaudio gaussian projection + make style
      
      * add pipeline to toctree
      
      * fix copied from
      
      * make quality
      
      * refactor timestep_features->time_proj
      
      * refactor joint_attention_kwargs->cross_attention_kwargs
      
      * remove forward_chunk
      
      * move StableAudioDitModel to transformers folder
      
      * correct convert + remove partial rotary embed
      
      * apply suggestions from yiyixuxu -> removing attn.kv_heads
      
      * remove temb
      
      * remove cross_attention_kwargs
      
      * further removal of cross_attention_kwargs
      
      * remove text encoder autocast to fp16
      
      * continue removing autocast
      
      * make style
      
      * refactor how text and audio are embedded
      
      * add paper
      
      * update example code
      
      * make style
      
      * unify projection model forward + fix device placement
      
      * make style
      
      * remove fuse qkv
      
      * apply suggestions from review
      
      * Update src/diffusers/pipelines/stable_audio/pipeline_stable_audio.py
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      
      * make style
      
      * smaller models in fast tests
      
      * pass sequential offloading fast tests
      
      * add docs for vae and autoencoder
      
      * make style and update example
      
      * remove useless import
      
      * add cosine scheduler
      
      * dummy classes
      
      * cosine scheduler docs
      
      * better description of scheduler
      
      ---------
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      69e72b1d
  18. 20 Jul, 2024 1 commit
  19. 11 Jul, 2024 2 commits
  20. 08 Jul, 2024 1 commit
  21. 04 Jul, 2024 1 commit
  22. 02 Jul, 2024 1 commit
  23. 01 Jul, 2024 1 commit
  24. 24 Jun, 2024 1 commit
  25. 12 Jun, 2024 1 commit
  26. 01 Jun, 2024 1 commit
  27. 29 May, 2024 1 commit
  28. 10 May, 2024 1 commit
    • Mark Van Aken's avatar
      #7535 Update FloatTensor type hints to Tensor (#7883) · be4afa0b
      Mark Van Aken authored
      * find & replace all FloatTensors to Tensor
      
      * apply formatting
      
      * Update torch.FloatTensor to torch.Tensor in the remaining files
      
      * formatting
      
      * Fix the rest of the places where FloatTensor is used as well as in documentation
      
      * formatting
      
      * Update new file from FloatTensor to Tensor
      be4afa0b
  29. 19 Apr, 2024 1 commit
  30. 02 Apr, 2024 2 commits
  31. 13 Mar, 2024 1 commit
    • Sayak Paul's avatar
      [LoRA] use the PyTorch classes wherever needed and start depcrecation cycles (#7204) · 531e7191
      Sayak Paul authored
      * fix PyTorch classes and start deprecsation cycles.
      
      * remove args crafting for accommodating scale.
      
      * remove scale check in feedforward.
      
      * assert against nn.Linear and not CompatibleLinear.
      
      * remove conv_cls and lineaR_cls.
      
      * remove scale
      
      * 👋
      
       scale.
      
      * fix: unet2dcondition
      
      * fix attention.py
      
      * fix: attention.py again
      
      * fix: unet_2d_blocks.
      
      * fix-copies.
      
      * more fixes.
      
      * fix: resnet.py
      
      * more fixes
      
      * fix i2vgenxl unet.
      
      * depcrecate scale gently.
      
      * fix-copies
      
      * Apply suggestions from code review
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      
      * quality
      
      * throw warning when scale is passed to the the BasicTransformerBlock class.
      
      * remove scale from signature.
      
      * cross_attention_kwargs, very nice catch by Yiyi
      
      * fix: logger.warn
      
      * make deprecation message clearer.
      
      * address final comments.
      
      * maintain same depcrecation message and also add it to activations.
      
      * address yiyi
      
      * fix copies
      
      * Apply suggestions from code review
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      
      * more depcrecation
      
      * fix-copies
      
      ---------
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      531e7191
  32. 08 Feb, 2024 1 commit
  33. 31 Jan, 2024 1 commit
  34. 28 Dec, 2023 1 commit
  35. 21 Dec, 2023 1 commit
    • Will Berman's avatar
      open muse (#5437) · 40398152
      Will Berman authored
      
      
      amused
      
      rename
      
      Update docs/source/en/api/pipelines/amused.md
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      AdaLayerNormContinuous default values
      
      custom micro conditioning
      
      micro conditioning docs
      
      put lookup from codebook in constructor
      
      fix conversion script
      
      remove manual fused flash attn kernel
      
      add training script
      
      temp remove training script
      
      add dummy gradient checkpointing func
      
      clarify temperatures is an instance variable by setting it
      
      remove additional SkipFF block args
      
      hardcode norm args
      
      rename tests folder
      
      fix paths and samples
      
      fix tests
      
      add training script
      
      training readme
      
      lora saving and loading
      
      non-lora saving/loading
      
      some readme fixes
      
      guards
      
      Update docs/source/en/api/pipelines/amused.md
      Co-authored-by: default avatarSuraj Patil <surajp815@gmail.com>
      
      Update examples/amused/README.md
      Co-authored-by: default avatarSuraj Patil <surajp815@gmail.com>
      
      Update examples/amused/train_amused.py
      Co-authored-by: default avatarSuraj Patil <surajp815@gmail.com>
      
      vae upcasting
      
      add fp16 integration tests
      
      use tuple for micro cond
      
      copyrights
      
      remove casts
      
      delegate to torch.nn.LayerNorm
      
      move temperature to pipeline call
      
      upsampling/downsampling changes
      40398152
  36. 19 Dec, 2023 1 commit