    [core] Mochi T2V (#9769) · 3f329a42
    Aryan authored
    
    
    * update
    
    * update
    
    * update transformer
    
    * make style
    
    * fix
    
    * add conversion script
    
    * update
    
    * fix
    
    * update
    
    * fix
    
    * update
    
    * fixes
    
    * make style
    
    * update
    
    * update
    
    * update
    
    * init
    
    * update
    
    * update
    
    * add
    
    * up
    
    * up
    
    * up
    
    * update
    
    * mochi transformer
    
    * remove original implementation
    
    * make style
    
    * update inits
    
    * update conversion script
    
    * docs
    
    * Update src/diffusers/pipelines/mochi/pipeline_mochi.py
    Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
    
    * Update src/diffusers/pipelines/mochi/pipeline_mochi.py
    Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
    
    * fix docs
    
    * pipeline fixes
    
    * make style
    
    * invert sigmas in scheduler; fix pipeline
    
    * fix pipeline num_frames
    
    * flip proj and gate in swiglu
    
    * make style
    
    * fix
    
    * make style
    
    * fix tests
    
    * latent mean and std fix
    
    * update
    
    * cherry-pick 1069d210e1b9e84a366cdc7a13965626ea258178
    
    * remove additional sigma already handled by flow match scheduler
    
    * fix
    
    * remove hardcoded value
    
    * replace conv1x1 with linear
    
    * Update src/diffusers/pipelines/mochi/pipeline_mochi.py
    Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
    
    * framewise decoding and conv_cache
    
    * make style
    
    * Apply suggestions from code review
    
    * mochi vae encoder changes
    
    * rebase correctly
    
    * Update scripts/convert_mochi_to_diffusers.py
    
    * fix tests
    
    * fixes
    
    * make style
    
    * update
    
    * make style
    
    * update
    
    * add framewise and tiled encoding
    
    * make style
    
    * make original vae implementation behaviour the default; note: framewise encoding does not work
    
    * remove framewise encoding implementation due to presence of attn layers
    
    * fight test 1
    
    * fight test 2
    
    ---------
    Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
    Co-authored-by: yiyixuxu <yixu310@gmail.com>
__init__.py 35.4 KB