  1. 29 Aug, 2023 5 commits
    • fix typo (#4822) · 0699ac62
      zideliu authored
    • make style · a76f2ad5
      Patrick von Platen authored
    • Support saving multiple t2i adapter models under one checkpoint (#4798) · 7200daa4
      VitjanZ authored
      
      
      * adding save and load for MultiAdapter, adding test
      
      * Apply suggestions from code review
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Adding changes from review test_stable_diffusion_adapter
      
      * import sorting fix
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Fuse loras (#4473) · c583f3b4
      Patrick von Platen authored
      
      
      * Fuse loras
      
      * initial implementation.
      
      * add slow test one.
      
      * styling
      
      * add: test for checking efficiency
      
      * print
      
      * position
      
      * place model offload correctly
      
      * style
      
      * style.
      
      * unfuse test.
      
      * final checks
      
      * remove warning test
      
      * remove warnings altogether
      
      * debugging
      
      * tighten up tests.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * suit up the generator initialization a bit.
      
      * remove print
      
      * update assertion.
      
      * debugging
      
      * remove print.
      
      * fix: assertions.
      
      * style
      
      * can generator be a problem?
      
      * generator
      
      * correct tests.
      
      * support text encoder lora fusion.
      
      * tighten up tests.
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
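For readers unfamiliar with the term, fusing a LoRA generally means folding the low-rank update into the base weight, so that inference needs a single matrix instead of a base path plus an adapter path. Unfusing subtracts the same update again. The sketch below illustrates that idea in plain Python under the standard LoRA formulation (`W + scale * up @ down`). It is not the code from #4473, and the helper names `fuse_low_rank_update` and `unfuse_low_rank_update` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def fuse_low_rank_update(weight, up, down, scale=1.0):
    """Hypothetical helper: return weight + scale * (up @ down)."""
    delta = matmul(up, down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def unfuse_low_rank_update(fused, up, down, scale=1.0):
    """Hypothetical helper: undo the fusion by subtracting the same update."""
    delta = matmul(up, down)
    return [
        [w - scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(fused, delta)
    ]


# Toy 2x2 base weight with a rank-1 update (assumed values, for illustration only).
weight = [[1.0, 2.0], [3.0, 4.0]]
up = [[1.0], [2.0]]        # shape (2, 1)
down = [[0.5, -1.0]]       # shape (1, 2)
scale = 0.5

fused = fuse_low_rank_update(weight, up, down, scale)
restored = unfuse_low_rank_update(fused, up, down, scale)
```

With exactly representable toy values like these, `restored` matches `weight` exactly. With real floating-point weights, a fuse/unfuse round trip is typically only approximately lossless, which is presumably why the commit adds a dedicated unfuse test.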
    • add models for T2I-Adapter-XL (#4696) · 12358b98
      Chong Mou authored
      
      
      * T2I-Adapter-XL
      
      * update
      
      * update
      
      * add pipeline
      
      * modify pipeline
      
      * modify pipeline
      
      * modify pipeline
      
      * modify pipeline
      
      * modify pipeline
      
      * modify modeling_text_unet
      
      * fix styling.
      
      * fix: copies.
      
      * adapter settings
      
      * new test case
      
      * new test case
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * revert prints.
      
      * new test case
      
      * remove print
      
      * org test case
      
      * add test_pipeline
      
      * styling.
      
      * fix copies.
      
      * modify test parameter
      
      * style.
      
      * add adapter-xl doc
      
      * double quotes in docs
      
      * Fix potential type mismatch
      
      * style.
      
      ---------
      Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
  2. 28 Aug, 2023 6 commits
  3. 26 Aug, 2023 4 commits
  4. 25 Aug, 2023 10 commits
  5. 24 Aug, 2023 5 commits
  6. 23 Aug, 2023 4 commits
    • add a step_index counter (#4347) · cd21b965
      YiYi Xu authored
      
      
      add self.step_index
      
      ---------
      Co-authored-by: yiyixuxu <yixu310@gmail,com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • fix dummy import for AudioLDM2 (#4741) · 709a6428
      Suraj Patil authored
      * fix import
      
      * style
    • [AudioLDM Docs] Update docstring (#4744) · 0a0fe69a
      Sanchit Gandhi authored
    • Fix AutoencoderTiny encoder scaling convention (#4682) · 052bf328
      Ollin Boer Bohan authored
      * Fix AutoencoderTiny encoder scaling convention
      
        * Add [-1, 1] -> [0, 1] rescaling to EncoderTiny
      
        * Move [0, 1] -> [-1, 1] rescaling from AutoencoderTiny.decode to DecoderTiny
          (i.e. immediately after the final conv, as early as possible)
      
        * Fix missing [0, 255] -> [0, 1] rescaling in AutoencoderTiny.forward
      
        * Update AutoencoderTinyIntegrationTests to protect against scaling issues.
          The new test constructs a simple image, round-trips it through AutoencoderTiny,
          and confirms the decoded result is approximately equal to the source image.
          This test checks behavior with and without tiling enabled.
          This test will fail if new AutoencoderTiny scaling issues are introduced.
      
        * Context: Raw TAESD weights expect images in [0, 1], but diffusers'
          convention represents images with zero-centered values in [-1, 1],
          so AutoencoderTiny needs to scale / unscale images at the start of
          encoding and at the end of decoding in order to work with diffusers.
      
      * Re-add existing AutoencoderTiny test, update golden values
      
      * Add comments to AutoencoderTiny.forward
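The range conversions named in the commit message above can be sketched as plain functions. This is only an illustration of the arithmetic ([-1, 1] to [0, 1] before encoding, [0, 1] back to [-1, 1] after decoding, and [0, 255] to [0, 1]). The function names are hypothetical and the code operates on Python floats rather than tensors, so it should not be read as the AutoencoderTiny implementation.

```python
def centered_to_unit(x):
    """Map a zero-centered value in [-1, 1] to [0, 1] (the range raw TAESD weights expect)."""
    return (x + 1.0) / 2.0


def unit_to_centered(x):
    """Map a value in [0, 1] back to the zero-centered [-1, 1] convention."""
    return x * 2.0 - 1.0


def byte_to_unit(x):
    """Map an 8-bit value in [0, 255] to [0, 1]."""
    return x / 255.0


# Round trip in the spirit of the new integration test: scale on the way in,
# unscale on the way out, and compare with the source values.
source = [-1.0, -0.5, 0.0, 0.5, 1.0]
scaled = [centered_to_unit(v) for v in source]
round_trip = [unit_to_centered(v) for v in scaled]
```

Skipping either conversion, or applying one twice, shifts every value by a constant scale or offset. A round-trip check like the one the commit describes is a direct way to catch that kind of error.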
  7. 22 Aug, 2023 5 commits
  8. 21 Aug, 2023 1 commit
    • Add AudioLDM 2 (#4549) · 7a24977c
      Sanchit Gandhi authored
      
      
      * from audioldm
      
      * unet down + mid
      
      * vae, clap, flan-t5
      
      * start sequence audio mae
      
      * iterate on audioldm encoder
      
      * finish encoder
      
      * finish weight conversion
      
      * text pre-processing
      
      * gpt2 pre-processing
      
      * fix projection model
      
      * working
      
      * unet equivalence
      
      * finish in base
      
      * add unet cond
      
      * finish unet
      
      * finish custom unet
      
      * start clean-up
      
      * revert base unet changes
      
      * refactor pre-processing
      
      * tests: from audioldm
      
      * fix some tests
      
      * more fixes
      
      * iterate on tests
      
      * make fix copies
      
      * harden fast tests
      
      * slow integration tests
      
      * finish tests
      
      * update checkpoint
      
      * update copyright
      
      * docs
      
      * remove outdated method
      
      * add docstring
      
      * make style
      
      * remove decode latents
      
      * enable cpu offload
      
      * (text_encoder_1, tokenizer_1) -> (text_encoder, tokenizer)
      
      * more clean up
      
      * more refactor
      
      * build pr docs
      
      * Update docs/source/en/api/pipelines/audioldm2.md
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * small clean
      
      * tidy conversion
      
      * update for large checkpoint
      
      * generate -> generate_language_model
      
      * full clap model
      
      * shrink clap-audio in tests
      
      * fix large integration test
      
      * fix fast tests
      
      * use generation config
      
      * make style
      
      * update docs
      
      * finish docs
      
      * finish doc
      
      * update tests
      
      * fix last test
      
      * syntax
      
      * finalise tests
      
      * refactor projection model in prep for TTS
      
      * fix fast tests
      
      * style
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>