    Add AudioLDM 2 (#4549) · 7a24977c
    Sanchit Gandhi authored
    
    
    * from audioldm
    
    * unet down + mid
    
    * vae, clap, flan-t5
    
    * start sequence audio mae
    
    * iterate on audioldm encoder
    
    * finish encoder
    
    * finish weight conversion
    
    * text pre-processing
    
    * gpt2 pre-processing
    
    * fix projection model
    
    * working
    
    * unet equivalence
    
    * finish in base
    
    * add unet cond
    
    * finish unet
    
    * finish custom unet
    
    * start clean-up
    
    * revert base unet changes
    
    * refactor pre-processing
    
    * tests: from audioldm
    
    * fix some tests
    
    * more fixes
    
    * iterate on tests
    
    * make fix copies
    
    * harden fast tests
    
    * slow integration tests
    
    * finish tests
    
    * update checkpoint
    
    * update copyright
    
    * docs
    
    * remove outdated method
    
    * add docstring
    
    * make style
    
    * remove decode latents
    
    * enable cpu offload
    
    * (text_encoder_1, tokenizer_1) -> (text_encoder, tokenizer)
    
    * more clean up
    
    * more refactor
    
    * build pr docs
    
    * Update docs/source/en/api/pipelines/audioldm2.md
    Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
    
    * small clean
    
    * tidy conversion
    
    * update for large checkpoint
    
    * generate -> generate_language_model
    
    * full clap model
    
    * shrink clap-audio in tests
    
    * fix large integration test
    
    * fix fast tests
    
    * use generation config
    
    * make style
    
    * update docs
    
    * finish docs
    
    * finish doc
    
    * update tests
    
    * fix last test
    
    * syntax
    
    * finalise tests
    
    * refactor projection model in prep for TTS
    
    * fix fast tests
    
    * style
    
    ---------
    Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
transformer_2d.py 17.9 KB