Unverified Commit 5e3b7d2d authored by Bubbliiiing, committed by GitHub

Add EasyAnimateV5.1 text-to-video, image-to-video, control-to-video generation model (#10626)



* Update EasyAnimate V5.1

* Add docs && add tests && Fix comments problems in transformer3d and vae

* delete comments and remove useless import

* delete process

* Update EXAMPLE_DOC_STRING

* rename transformer file

* make fix-copies

* make style

* refactor pt. 1

* update toctree.yml

* add model tests

* Update layer_norm for norm_added_q and norm_added_k in Attention

* Fix processor problem

* refactor vae

* Fix problem in comments

* refactor tiling; remove einops dependency

* fix docs path

* make fix-copies

* Update src/diffusers/pipelines/easyanimate/pipeline_easyanimate_control.py

* update _toctree.yml

* fix test

* update

* update

* update

* make fix-copies

* fix tests

---------
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
parent 7513162b
@@ -290,6 +290,8 @@
   title: CogView4Transformer2DModel
 - local: api/models/dit_transformer2d
   title: DiTTransformer2DModel
+- local: api/models/easyanimate_transformer3d
+  title: EasyAnimateTransformer3DModel
 - local: api/models/flux_transformer
   title: FluxTransformer2DModel
 - local: api/models/hunyuan_transformer2d
@@ -352,6 +354,8 @@
   title: AutoencoderKLHunyuanVideo
 - local: api/models/autoencoderkl_ltx_video
   title: AutoencoderKLLTXVideo
+- local: api/models/autoencoderkl_magvit
+  title: AutoencoderKLMagvit
 - local: api/models/autoencoderkl_mochi
   title: AutoencoderKLMochi
 - local: api/models/autoencoder_kl_wan
@@ -430,6 +434,8 @@
   title: DiffEdit
 - local: api/pipelines/dit
   title: DiT
+- local: api/pipelines/easyanimate
+  title: EasyAnimate
 - local: api/pipelines/flux
   title: Flux
 - local: api/pipelines/control_flux_inpaint
......
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# AutoencoderKLMagvit
The 3D variational autoencoder (VAE) model with KL loss used in [EasyAnimate](https://github.com/aigc-apps/EasyAnimate) was introduced by Alibaba PAI.
The model can be loaded with the following code snippet.
```python
import torch

from diffusers import AutoencoderKLMagvit

vae = AutoencoderKLMagvit.from_pretrained("alibaba-pai/EasyAnimateV5.1-12b-zh", subfolder="vae", torch_dtype=torch.float16).to("cuda")
```
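Beyond loading, the `encode` and `decode` methods documented below follow the usual diffusers VAE interface. Below is a minimal round-trip sketch; the `(batch, channels, frames, height, width)` layout and the tensor sizes are assumptions for illustration only.

```python
import torch

from diffusers import AutoencoderKLMagvit

vae = AutoencoderKLMagvit.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh", subfolder="vae", torch_dtype=torch.float16
).to("cuda")

# Dummy video tensor; the 9-frame, 256x256 size is purely illustrative.
video = torch.randn(1, 3, 9, 256, 256, dtype=torch.float16, device="cuda")
with torch.no_grad():
    latents = vae.encode(video).latent_dist.sample()  # AutoencoderKLOutput.latent_dist
    reconstruction = vae.decode(latents).sample       # DecoderOutput.sample
print(latents.shape, reconstruction.shape)
```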
## AutoencoderKLMagvit
[[autodoc]] AutoencoderKLMagvit
- decode
- encode
- all
## AutoencoderKLOutput
[[autodoc]] models.autoencoders.autoencoder_kl.AutoencoderKLOutput
## DecoderOutput
[[autodoc]] models.autoencoders.vae.DecoderOutput
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# EasyAnimateTransformer3DModel
A Diffusion Transformer model for 3D video data used in [EasyAnimate](https://github.com/aigc-apps/EasyAnimate), introduced by Alibaba PAI.
The model can be loaded with the following code snippet.
```python
import torch

from diffusers import EasyAnimateTransformer3DModel

transformer = EasyAnimateTransformer3DModel.from_pretrained("alibaba-pai/EasyAnimateV5.1-12b-zh", subfolder="transformer", torch_dtype=torch.float16).to("cuda")
```
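A separately loaded transformer is typically passed back into the full pipeline rather than called directly. A minimal sketch (device placement and memory optimizations omitted):

```python
import torch

from diffusers import EasyAnimatePipeline, EasyAnimateTransformer3DModel

transformer = EasyAnimateTransformer3DModel.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh", subfolder="transformer", torch_dtype=torch.float16
)
# Reuse the custom-loaded transformer when assembling the pipeline.
pipeline = EasyAnimatePipeline.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh", transformer=transformer, torch_dtype=torch.float16
).to("cuda")
```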
## EasyAnimateTransformer3DModel
[[autodoc]] EasyAnimateTransformer3DModel
## Transformer2DModelOutput
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# EasyAnimate
[EasyAnimate](https://github.com/aigc-apps/EasyAnimate) is a text-to-video, image-to-video, and control-to-video generation project from Alibaba PAI.
The description from its GitHub page:
*EasyAnimate is a pipeline based on the transformer architecture, designed for generating AI images and videos, and for training baseline models and Lora models for Diffusion Transformer. We support direct prediction from pre-trained EasyAnimate models, allowing for the generation of videos with various resolutions, approximately 6 seconds in length, at 8fps (EasyAnimateV5.1, 1 to 49 frames). Additionally, users can train their own baseline and Lora models for specific style transformations.*
This pipeline was contributed by [bubbliiiing](https://github.com/bubbliiiing). The original codebase can be found [here](https://huggingface.co/alibaba-pai). The original weights can be found under [hf.co/alibaba-pai](https://huggingface.co/alibaba-pai).
There are two official EasyAnimate checkpoints for text-to-video and video-to-video.
| checkpoints | recommended inference dtype |
|:---:|:---:|
| [`alibaba-pai/EasyAnimateV5.1-12b-zh`](https://huggingface.co/alibaba-pai/EasyAnimateV5.1-12b-zh) | torch.float16 |
| [`alibaba-pai/EasyAnimateV5.1-12b-zh-InP`](https://huggingface.co/alibaba-pai/EasyAnimateV5.1-12b-zh-InP) | torch.float16 |
There is one official EasyAnimate checkpoint available for image-to-video and video-to-video.
| checkpoints | recommended inference dtype |
|:---:|:---:|
| [`alibaba-pai/EasyAnimateV5.1-12b-zh-InP`](https://huggingface.co/alibaba-pai/EasyAnimateV5.1-12b-zh-InP) | torch.float16 |
There are two official EasyAnimate checkpoints available for control-to-video.
| checkpoints | recommended inference dtype |
|:---:|:---:|
| [`alibaba-pai/EasyAnimateV5.1-12b-zh-Control`](https://huggingface.co/alibaba-pai/EasyAnimateV5.1-12b-zh-Control) | torch.float16 |
| [`alibaba-pai/EasyAnimateV5.1-12b-zh-Control-Camera`](https://huggingface.co/alibaba-pai/EasyAnimateV5.1-12b-zh-Control-Camera) | torch.float16 |
For the EasyAnimateV5.1 series:
- Text-to-video (T2V) and image-to-video (I2V) work at multiple resolutions; the width and height can vary from 256 to 1024.
- Both the T2V and I2V models support generation with 1 to 49 frames and work best at 49 frames. Exporting videos at 8 FPS is recommended. A minimal text-to-video example that follows these constraints is sketched below.
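This sketch assumes the `height`/`width` call arguments and a single CUDA device; the values are illustrative rather than a tuned recipe.

```py
import torch

from diffusers import EasyAnimatePipeline
from diffusers.utils import export_to_video

pipeline = EasyAnimatePipeline.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh", torch_dtype=torch.float16
).to("cuda")

video = pipeline(
    prompt="A cat walks on the grass, realistic style.",
    negative_prompt="bad detailed",
    height=512,   # within the supported 256-1024 range
    width=512,
    num_frames=49,
    num_inference_steps=30,
).frames[0]
export_to_video(video, "cat.mp4", fps=8)
```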
## Quantization
Quantization helps reduce the memory requirements of very large models by storing model weights in a lower precision data type. However, quantization may have varying impact on video quality depending on the video model.
Refer to the [Quantization](../../quantization/overview) overview to learn more about supported quantization backends and selecting a quantization backend that supports your use case. The example below demonstrates how to load a quantized [`EasyAnimatePipeline`] for inference with bitsandbytes.
```py
import torch
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig, EasyAnimateTransformer3DModel, EasyAnimatePipeline
from diffusers.utils import export_to_video

quant_config = DiffusersBitsAndBytesConfig(load_in_8bit=True)
transformer_8bit = EasyAnimateTransformer3DModel.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
)

pipeline = EasyAnimatePipeline.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh",
    transformer=transformer_8bit,
    torch_dtype=torch.float16,
    device_map="balanced",
)

prompt = "A cat walks on the grass, realistic style."
negative_prompt = "bad detailed"
video = pipeline(prompt=prompt, negative_prompt=negative_prompt, num_frames=49, num_inference_steps=30).frames[0]
export_to_video(video, "cat.mp4", fps=8)
```
## EasyAnimatePipeline
[[autodoc]] EasyAnimatePipeline
- all
- __call__
## EasyAnimatePipelineOutput
[[autodoc]] pipelines.easyanimate.pipeline_output.EasyAnimatePipelineOutput
@@ -94,6 +94,7 @@ else:
     "AutoencoderKLCogVideoX",
     "AutoencoderKLHunyuanVideo",
     "AutoencoderKLLTXVideo",
+    "AutoencoderKLMagvit",
     "AutoencoderKLMochi",
     "AutoencoderKLTemporalDecoder",
     "AutoencoderKLWan",
@@ -109,6 +110,7 @@ else:
     "ControlNetUnionModel",
     "ControlNetXSAdapter",
     "DiTTransformer2DModel",
+    "EasyAnimateTransformer3DModel",
     "FluxControlNetModel",
     "FluxMultiControlNetModel",
     "FluxTransformer2DModel",
@@ -293,6 +295,9 @@ else:
     "CogView4Pipeline",
     "ConsisIDPipeline",
     "CycleDiffusionPipeline",
+    "EasyAnimateControlPipeline",
+    "EasyAnimateInpaintPipeline",
+    "EasyAnimatePipeline",
     "FluxControlImg2ImgPipeline",
     "FluxControlInpaintPipeline",
     "FluxControlNetImg2ImgPipeline",
@@ -620,6 +625,7 @@ if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
     AutoencoderKLCogVideoX,
     AutoencoderKLHunyuanVideo,
     AutoencoderKLLTXVideo,
+    AutoencoderKLMagvit,
     AutoencoderKLMochi,
     AutoencoderKLTemporalDecoder,
     AutoencoderKLWan,
@@ -635,6 +641,7 @@ if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
     ControlNetUnionModel,
     ControlNetXSAdapter,
     DiTTransformer2DModel,
+    EasyAnimateTransformer3DModel,
     FluxControlNetModel,
     FluxMultiControlNetModel,
     FluxTransformer2DModel,
@@ -798,6 +805,9 @@ if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
     CogView4Pipeline,
     ConsisIDPipeline,
     CycleDiffusionPipeline,
+    EasyAnimateControlPipeline,
+    EasyAnimateInpaintPipeline,
+    EasyAnimatePipeline,
     FluxControlImg2ImgPipeline,
     FluxControlInpaintPipeline,
     FluxControlNetImg2ImgPipeline,
......
@@ -33,6 +33,7 @@ if is_torch_available():
     _import_structure["autoencoders.autoencoder_kl_cogvideox"] = ["AutoencoderKLCogVideoX"]
     _import_structure["autoencoders.autoencoder_kl_hunyuan_video"] = ["AutoencoderKLHunyuanVideo"]
     _import_structure["autoencoders.autoencoder_kl_ltx"] = ["AutoencoderKLLTXVideo"]
+    _import_structure["autoencoders.autoencoder_kl_magvit"] = ["AutoencoderKLMagvit"]
     _import_structure["autoencoders.autoencoder_kl_mochi"] = ["AutoencoderKLMochi"]
     _import_structure["autoencoders.autoencoder_kl_temporal_decoder"] = ["AutoencoderKLTemporalDecoder"]
     _import_structure["autoencoders.autoencoder_kl_wan"] = ["AutoencoderKLWan"]
@@ -72,6 +73,7 @@ if is_torch_available():
     _import_structure["transformers.transformer_allegro"] = ["AllegroTransformer3DModel"]
     _import_structure["transformers.transformer_cogview3plus"] = ["CogView3PlusTransformer2DModel"]
     _import_structure["transformers.transformer_cogview4"] = ["CogView4Transformer2DModel"]
+    _import_structure["transformers.transformer_easyanimate"] = ["EasyAnimateTransformer3DModel"]
     _import_structure["transformers.transformer_flux"] = ["FluxTransformer2DModel"]
     _import_structure["transformers.transformer_hunyuan_video"] = ["HunyuanVideoTransformer3DModel"]
     _import_structure["transformers.transformer_ltx"] = ["LTXVideoTransformer3DModel"]
@@ -109,6 +111,7 @@ if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
     AutoencoderKLCogVideoX,
     AutoencoderKLHunyuanVideo,
     AutoencoderKLLTXVideo,
+    AutoencoderKLMagvit,
     AutoencoderKLMochi,
     AutoencoderKLTemporalDecoder,
     AutoencoderKLWan,
@@ -144,6 +147,7 @@ if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
     ConsisIDTransformer3DModel,
     DiTTransformer2DModel,
     DualTransformer2DModel,
+    EasyAnimateTransformer3DModel,
     FluxTransformer2DModel,
     HunyuanDiT2DModel,
     HunyuanVideoTransformer3DModel,
......
@@ -274,7 +274,10 @@ class Attention(nn.Module):
             self.to_add_out = None

         if qk_norm is not None and added_kv_proj_dim is not None:
-            if qk_norm == "fp32_layer_norm":
+            if qk_norm == "layer_norm":
+                self.norm_added_q = nn.LayerNorm(dim_head, eps=eps, elementwise_affine=elementwise_affine)
+                self.norm_added_k = nn.LayerNorm(dim_head, eps=eps, elementwise_affine=elementwise_affine)
+            elif qk_norm == "fp32_layer_norm":
                 self.norm_added_q = FP32LayerNorm(dim_head, elementwise_affine=False, bias=False, eps=eps)
                 self.norm_added_k = FP32LayerNorm(dim_head, elementwise_affine=False, bias=False, eps=eps)
             elif qk_norm == "rms_norm":
......
@@ -5,6 +5,7 @@ from .autoencoder_kl_allegro import AutoencoderKLAllegro
 from .autoencoder_kl_cogvideox import AutoencoderKLCogVideoX
 from .autoencoder_kl_hunyuan_video import AutoencoderKLHunyuanVideo
 from .autoencoder_kl_ltx import AutoencoderKLLTXVideo
+from .autoencoder_kl_magvit import AutoencoderKLMagvit
 from .autoencoder_kl_mochi import AutoencoderKLMochi
 from .autoencoder_kl_temporal_decoder import AutoencoderKLTemporalDecoder
 from .autoencoder_kl_wan import AutoencoderKLWan
......
This diff is collapsed.
@@ -19,6 +19,7 @@ if is_torch_available():
     from .transformer_allegro import AllegroTransformer3DModel
     from .transformer_cogview3plus import CogView3PlusTransformer2DModel
     from .transformer_cogview4 import CogView4Transformer2DModel
+    from .transformer_easyanimate import EasyAnimateTransformer3DModel
     from .transformer_flux import FluxTransformer2DModel
     from .transformer_hunyuan_video import HunyuanVideoTransformer3DModel
     from .transformer_ltx import LTXVideoTransformer3DModel
......
This diff is collapsed.
@@ -216,6 +216,11 @@ else:
         "IFPipeline",
         "IFSuperResolutionPipeline",
     ]
+    _import_structure["easyanimate"] = [
+        "EasyAnimatePipeline",
+        "EasyAnimateInpaintPipeline",
+        "EasyAnimateControlPipeline",
+    ]
     _import_structure["hunyuandit"] = ["HunyuanDiTPipeline"]
     _import_structure["hunyuan_video"] = ["HunyuanVideoPipeline", "HunyuanSkyreelsImageToVideoPipeline"]
     _import_structure["kandinsky"] = [
@@ -546,6 +551,11 @@ if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
         VersatileDiffusionTextToImagePipeline,
         VQDiffusionPipeline,
     )
+    from .easyanimate import (
+        EasyAnimateControlPipeline,
+        EasyAnimateInpaintPipeline,
+        EasyAnimatePipeline,
+    )
     from .flux import (
         FluxControlImg2ImgPipeline,
         FluxControlInpaintPipeline,
......
from typing import TYPE_CHECKING

from ...utils import (
    DIFFUSERS_SLOW_IMPORT,
    OptionalDependencyNotAvailable,
    _LazyModule,
    get_objects_from_module,
    is_torch_available,
    is_transformers_available,
)


_dummy_objects = {}
_import_structure = {}


try:
    if not (is_transformers_available() and is_torch_available()):
        raise OptionalDependencyNotAvailable()
except OptionalDependencyNotAvailable:
    from ...utils import dummy_torch_and_transformers_objects  # noqa F403

    _dummy_objects.update(get_objects_from_module(dummy_torch_and_transformers_objects))
else:
    _import_structure["pipeline_easyanimate"] = ["EasyAnimatePipeline"]
    _import_structure["pipeline_easyanimate_control"] = ["EasyAnimateControlPipeline"]
    _import_structure["pipeline_easyanimate_inpaint"] = ["EasyAnimateInpaintPipeline"]

if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
    try:
        if not (is_transformers_available() and is_torch_available()):
            raise OptionalDependencyNotAvailable()
    except OptionalDependencyNotAvailable:
        from ...utils.dummy_torch_and_transformers_objects import *
    else:
        from .pipeline_easyanimate import EasyAnimatePipeline
        from .pipeline_easyanimate_control import EasyAnimateControlPipeline
        from .pipeline_easyanimate_inpaint import EasyAnimateInpaintPipeline

else:
    import sys

    sys.modules[__name__] = _LazyModule(
        __name__,
        globals()["__file__"],
        _import_structure,
        module_spec=__spec__,
    )

    for name, value in _dummy_objects.items():
        setattr(sys.modules[__name__], name, value)
This diff is collapsed.
from dataclasses import dataclass

import torch

from diffusers.utils import BaseOutput


@dataclass
class EasyAnimatePipelineOutput(BaseOutput):
    r"""
    Output class for EasyAnimate pipelines.

    Args:
        frames (`torch.Tensor`, `np.ndarray`, or List[List[PIL.Image.Image]]):
            List of video outputs - It can be a nested list of length `batch_size`, with each sub-list containing
            denoised PIL image sequences of length `num_frames`. It can also be a NumPy array or Torch tensor of
            shape `(batch_size, num_frames, channels, height, width)`.
    """

    frames: torch.Tensor
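For reference, a short sketch of how this output class holds frames in the documented layout; the tensor shape below is illustrative only.

```python
import torch

from diffusers.pipelines.easyanimate.pipeline_output import EasyAnimatePipelineOutput

# Stand-in tensor in the documented (batch_size, num_frames, channels, height, width) layout.
frames = torch.rand(1, 49, 3, 512, 512)
output = EasyAnimatePipelineOutput(frames=frames)
print(output.frames.shape)  # torch.Size([1, 49, 3, 512, 512])
```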
This diff is collapsed.
This diff is collapsed.