Commits · f295e2eefcebf21781f888b407eefadb5e121f7b · renzhc / diffusers_dcu

04 Dec, 2024 1 commit

[tests] refactor vae tests (#9808) · c1926cef

Sayak Paul authored Dec 04, 2024



* add: autoencoderkl tests

* autoencodertiny.

* fix

* asymmetric autoencoder.

* more

* integration tests for stable audio decoder.

* consistency decoder vae tests

* remove grad check from consistency decoder.

* cog

* bye test_models_vae.py

* fix

* fix

* remove allegro

* fixes

* fixes

* fixes

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>


18 Nov, 2024 1 commit

CogVideoX 1.5 (#9877) · 3b283061

Yuxuan.Zhang authored Nov 19, 2024



* CogVideoX1_1PatchEmbed test

* 1360 * 768

* refactor

* make style

* update docs

* add modeling tests for cogvideox 1.5

* update

* make fix-copies

* add ofs embed(for convert)

* add ofs embed(for convert)

* more resolution for cogvideox1.5-5b-i2v

* use even number of latent frames only

* update pipeline implementations

* make style

* set patch_size_t as None by default

* #skip frames 0

* refactor

* make style

* update docs

* fix ofs_embed

* update docs

* invert_scale_latents

* update

* fix

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/cogvideox_transformer_3d.py

* update conversion script

* remove copied from

* fix test

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

---------
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


08 Nov, 2024 1 commit

Enabling gradient checkpointing in eval() mode (#9878) · 5b972fbd

Michael Tkachuk authored Nov 08, 2024

* refactored

05 Nov, 2024 1 commit

[core] Mochi T2V (#9769) · 3f329a42

Aryan authored Nov 05, 2024



* update

* update

* update transformer

* make style

* fix

* add conversion script

* update

* fix

* update

* fix

* update

* fixes

* make style

* update

* update

* update

* init

* update

* update

* add

* up

* up

* up

* update

* mochi transformer

* remove original implementation

* make style

* update inits

* update conversion script

* docs

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fix docs

* pipeline fixes

* make style

* invert sigmas in scheduler; fix pipeline

* fix pipeline num_frames

* flip proj and gate in swiglu

* make style

* fix

* make style

* fix tests

* latent mean and std fix

* update

* cherry-pick 1069d210e1b9e84a366cdc7a13965626ea258178

* remove additional sigma already handled by flow match scheduler

* fix

* remove hardcoded value

* replace conv1x1 with linear

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* framewise decoding and conv_cache

* make style

* Apply suggestions from code review

* mochi vae encoder changes

* rebase correctly

* Update scripts/convert_mochi_to_diffusers.py

* fix tests

* fixes

* make style

* update

* make style

* update

* add framewise and tiled encoding

* make style

* make original vae implementation behaviour the default; note: framewise encoding does not work

* remove framewise encoding implementation due to presence of attn layers

* fight test 1

* fight test 2

---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>


16 Oct, 2024 1 commit

[core] improve VAE encode/decode framewise batching (#9684) · d204e532

Aryan authored Oct 16, 2024



* update

* apply suggestions from review

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>


02 Oct, 2024 1 commit

fix cogvideox autoencoder decode (#9569) · 7f323f0f

Xiangchendong authored Oct 03, 2024

Co-authored-by: Aryan <aryan@huggingface.co>

28 Sep, 2024 1 commit

[refactor] remove conv_cache from CogVideoX VAE (#9524) · bd4df285

Aryan authored Sep 28, 2024



* remove conv cache from the layer and pass as arg instead

* make style

* yiyi's cleaner implementation
Co-Authored-By: YiYi Xu <yixu310@gmail.com>

* sayak's compiled implementation
Co-Authored-By: Sayak Paul <spsayakpaul@gmail.com>

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
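
The first bullet of this commit describes taking the convolution cache off the layer and passing it as an argument instead. The following is a minimal pure-Python sketch of that pattern, not the diffusers implementation. The names `causal_conv1d` and `run_chunked`, the 1D list representation of frames, and the replicate-first-frame padding are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def causal_conv1d(
    frames: List[float],
    kernel: List[float],
    conv_cache: Optional[List[float]] = None,
) -> Tuple[List[float], List[float]]:
    """Causal 1D convolution over one non-empty chunk of frames.

    The cache (the last len(kernel) - 1 inputs of the previous chunk) is
    passed in and returned, so the function keeps no state between calls.
    """
    pad = len(kernel) - 1
    if conv_cache is None:
        # Assumption: the first chunk is left-padded by repeating its first frame.
        conv_cache = [frames[0]] * pad
    padded = conv_cache + frames
    out = [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(frames))
    ]
    # The trailing inputs become the cache for the next chunk.
    new_cache = padded[len(padded) - pad:] if pad > 0 else []
    return out, new_cache


def run_chunked(frames: List[float], kernel: List[float], chunk_size: int) -> List[float]:
    """Process frames chunk by chunk, threading the cache through the calls."""
    cache = None
    out: List[float] = []
    for start in range(0, len(frames), chunk_size):
        chunk_out, cache = causal_conv1d(frames[start:start + chunk_size], kernel, cache)
        out.extend(chunk_out)
    return out
```

Because the caller owns the cache, processing the frames in chunks gives the same result as processing them in one call, and no state is left behind on a layer object between runs.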


16 Sep, 2024 1 commit

CogVideoX-5b-I2V support (#9418) · 8336405e

Yuxuan.Zhang authored Sep 16, 2024



* draft Init

* draft

* vae encode image

* make style

* image latents preparation

* remove image encoder from conversion script

* fix minor bugs

* make pipeline work

* make style

* remove debug prints

* fix imports

* update example

* make fix-copies

* add fast tests

* fix import

* update vae

* update docs

* update image link

* apply suggestions from review

* apply suggestions from review

* add slow test

* make use of learned positional embeddings

* apply suggestions from review

* doc change

* Update convert_cogvideox_to_diffusers.py

* make style

* final changes

* make style

* fix tests

---------
Co-authored-by: Aryan <aryan@huggingface.co>


02 Sep, 2024 1 commit

[core] CogVideoX memory optimizations in VAE encode (#9340) · af6c0fb7

Aryan authored Sep 02, 2024

fake context parallel cache, vae encode tiling

(cherry picked from commit bf890bca0e8aed875d6a207f9b826ce894901522)


23 Aug, 2024 1 commit

Cogvideox-5B Model adapter change (#9203) · 960c149c

zR authored Aug 23, 2024



* draft of embedding

---------
Co-authored-by: Aryan <aryan@huggingface.co>


13 Aug, 2024 1 commit

[refactor] CogVideoX followups + tiled decoding support (#9150) · a85b34e7

Aryan authored Aug 14, 2024

* refactor context parallel cache; update torch compile time benchmark

* add tiling support

* make style

* remove num_frames % 8 == 0 requirement

* update default num_frames to original value

* add explanations + refactor

* update torch compile example

* update docs

* update

* clean up if-statements

* address review comments

* add test for vae tiling

* update docs

* update docs

* update docstrings

* add modeling test for cogvideox transformer

* make style
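
The "add tiling support" entry above refers to decoding in overlapping tiles and blending the seams. The following is a rough one-dimensional sketch of that idea in pure Python, under stated assumptions: `tiled_apply`, its parameters, the linear blend weights, and the length-preserving `fn` are hypothetical and are not taken from the diffusers code.

```python
from typing import Callable, List


def tiled_apply(
    x: List[float],
    fn: Callable[[List[float]], List[float]],
    tile_size: int,
    overlap: int,
) -> List[float]:
    """Apply a length-preserving fn to overlapping tiles of x and blend the seams."""
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must satisfy 0 <= overlap < tile_size")
    stride = tile_size - overlap
    out: List[float] = []
    prev = None
    for start in range(0, len(x), stride):
        tile = list(fn(x[start:start + tile_size]))
        if prev is not None:
            # Number of positions shared by the previous tile's tail and this tile's head.
            n = min(overlap, len(tile), len(prev) - stride)
            for i in range(n):
                w = i / n  # weight ramps from 0 (all previous tile) towards 1 (all current tile)
                tile[i] = prev[stride + i] * (1 - w) + tile[i] * w
        # Keep only the non-overlapping part; the tail is blended into the next tile.
        out.extend(tile[:stride])
        prev = tile
    return out
```

With an identity `fn` the output reproduces the input, which is a quick sanity check that the tiles are cropped and stitched at the right offsets. The memory saving comes from `fn` only ever seeing `tile_size` elements at a time.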


07 Aug, 2024 1 commit

Add CogVideoX text-to-video generation model (#9082) · 2dad462d

zR authored Aug 07, 2024



* add CogVideoX

---------
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
