## Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models
Official PyTorch Implementation

[![Arxiv](https://img.shields.io/badge/Arxiv-b31b1b.svg)](https://arxiv.org/abs/2407.15642) [![Project Page](https://img.shields.io/badge/Project-Website-blue)](https://maxin-cn.github.io/cinemo_project/) [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-yellow)](https://huggingface.co/spaces/maxin-cn/Cinemo)

> [**Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models**](https://maxin-cn.github.io/cinemo_project/)
> [Xin Ma](https://maxin-cn.github.io/), [Yaohui Wang*†](https://wyhsirius.github.io/), [Gengyun Jia](https://scholar.google.com/citations?user=_04pkGgAAAAJ&hl=zh-CN), [Xinyuan Chen](https://scholar.google.com/citations?user=3fWSC8YAAAAJ), [Yuan-Fang Li](https://users.monash.edu/~yli/), [Cunjian Chen*](https://cunjian.github.io/), [Yu Qiao](https://scholar.google.com.hk/citations?user=gFtI-8QAAAAJ&hl=zh-CN)
> (*Corresponding authors, †Project Lead)

This repo contains the pre-trained weights and sampling code for Cinemo. Please visit our [project page](https://maxin-cn.github.io/cinemo_project/) for more results.

## News

- (🔥 New) Jul. 29, 2024. 💥 A [HuggingFace space](https://huggingface.co/spaces/maxin-cn/Cinemo) has been added; you can also launch the [Gradio interface](#gradio-interface) locally.
- (🔥 New) Jul. 23, 2024. 💥 Our paper is released on [arXiv](https://arxiv.org/abs/2407.15642).
- (🔥 New) Jun. 2, 2024. 💥 The inference code is released. The checkpoint can be found [here](https://huggingface.co/maxin-cn/Cinemo/tree/main).

## Setup

Download and set up the repo:

```bash
git clone https://github.com/maxin-cn/Cinemo
cd Cinemo
conda env create -f environment.yml
conda activate cinemo
```

## Animation

You can sample from our **pre-trained Cinemo models** with [`animation.py`](pipelines/animation.py). Weights for our pre-trained Cinemo model can be found [here](https://huggingface.co/maxin-cn/Cinemo/tree/main). The script has various arguments for adjusting the number of sampling steps, changing the classifier-free guidance scale, etc.:

```bash
bash pipelines/animation.sh
```

The related model weights will be downloaded automatically, and the following results can be obtained:

| Input image / Output video | Input image / Output video |
|---|---|
| "People Walking" | "Sea Swell" |
| "Girl Dancing under the Stars" | "Dragon Glowing Eyes" |
| "Bubbles Floating upwards" | "Snowman Waving his Hand" |

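
For readers unfamiliar with the classifier-free guidance scale mentioned above, here is a minimal sketch of how such a scale is commonly applied in diffusion samplers. It is not taken from `animation.py`: the function name `apply_classifier_free_guidance` and the toy list inputs are hypothetical, and a real pipeline would operate on tensors of predicted noise rather than Python lists.

```python
from typing import List, Sequence


def apply_classifier_free_guidance(
    noise_uncond: Sequence[float],
    noise_cond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Combine unconditional and conditional noise predictions.

    Hypothetical helper for illustration only. It uses the standard
    classifier-free guidance rule:
        guided = uncond + scale * (cond - uncond)
    """
    if len(noise_uncond) != len(noise_cond):
        raise ValueError("predictions must have the same length")
    return [
        u + guidance_scale * (c - u)
        for u, c in zip(noise_uncond, noise_cond)
    ]


if __name__ == "__main__":
    uncond = [0.0, 1.0]  # toy prediction without the text prompt
    cond = [1.0, 3.0]    # toy prediction with the text prompt
    for scale in (0.0, 1.0, 2.0):
        print(scale, apply_classifier_free_guidance(uncond, cond, scale))
```

Under this rule, a scale of 0 ignores the prompt, a scale of 1 reproduces the conditional prediction, and larger values push the sample further toward the prompt, usually at some cost to diversity.
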
## Gradio interface

We also provide a local Gradio interface. To launch it, run:

```bash
python app.py
```

You can specify the `--share` and `--server_name` arguments to suit your needs.

## Other Applications

You can also use Cinemo for other applications, such as motion transfer and video editing:

```bash
bash pipelines/video_editing.sh
```

The related checkpoints will be downloaded automatically, and the following results will be obtained:

[Results table: input video, first frame, edited first frame, output video]

or motion transfer:

[Results table: input video, first frame, edited first frame, output video]

## Contact Us

Xin Ma: xin.ma1@monash.edu, Yaohui Wang: wangyaohui@pjlab.org.cn

## Citation

If you find this work useful for your research, please consider citing it.

```bibtex
@article{ma2024cinemo,
  title={Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models},
  author={Ma, Xin and Wang, Yaohui and Jia, Gengyun and Chen, Xinyuan and Li, Yuan-Fang and Chen, Cunjian and Qiao, Yu},
  journal={arXiv preprint arXiv:2407.15642},
  year={2024}
}
```

## Acknowledgments

Cinemo has been greatly inspired by the following amazing works and teams: [LaVie](https://github.com/Vchitect/LaVie) and [SEINE](https://github.com/Vchitect/SEINE). We thank all the contributors for open-sourcing their work.

## License

The code and model weights are licensed under [LICENSE](LICENSE).