Unverified Commit 9b6fd118 authored by Musisoul, committed by GitHub

[Feat] entrypoint like diffusers (#475)



### Single GPU
```bash
python examples/simple_launch.py
```
```python
# examples/simple_launch.py
from lightx2v import LightGenerator

generator = LightGenerator(
    model_path="/path/to/Wan2.1-T2V-1.3B",
    model_cls="wan2.1",
    task="t2v",
)

video_path = generator.generate(
    prompt="Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.",
    negative_prompt="镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走",
    seed=42,
    save_result_path="output.mp4",
)
```
### Multi-GPU
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
torchrun --nproc_per_node=8 examples/multi_launch.py
```
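The multi-GPU script itself is not reproduced in this description. Below is a hypothetical sketch of what `examples/multi_launch.py` could look like, written against the `LightX2VPipeline.enable_parallel` API documented in `examples/README.md`; the actual script added by this PR may use the new `LightGenerator` entrypoint and differ in its defaults.
```python
# Hypothetical multi-GPU launch sketch (not the actual examples/multi_launch.py).
# Run with: torchrun --nproc_per_node=8 examples/multi_launch.py
from lightx2v import LightX2VPipeline

pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-T2V-1.3B",
    model_cls="wan2.1",
    task="t2v",
)

# Ulysses sequence parallelism across the 8 ranks started by torchrun;
# parallel settings must be enabled before create_generator().
pipe.enable_parallel(
    seq_p_size=8,
    seq_p_attn_type="ulysses",
)

pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    height=480,
    width=832,
    num_frames=81,
    guidance_scale=5.0,
    sample_shift=5.0,
)

pipe.generate(
    seed=42,
    prompt="Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.",
    negative_prompt="",
    save_result_path="output.mp4",
)
```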

---------
Co-authored-by: gushiqiao <975033167@qq.com>
parent d996a81c
......@@ -20,12 +20,66 @@
## :fire: Latest News
- **November 21, 2025:** 🚀 We support the [HunyuanVideo-1.5](https://huggingface.co/tencent/HunyuanVideo-1.5) video generation model since Day 0. With the same number of GPUs, LightX2V can achieve a speed improvement of over 2 times and supports deployment on GPUs with lower memory (such as the 24GB RTX 4090). It also supports CFG/Ulysses parallelism, efficient offloading, TeaCache/MagCache technologies, and more. We will soon update our models on our [HuggingFace page](https://huggingface.co/lightx2v), including quantization, step distillation, VAE distillation, and other related models. Refer to [this](https://github.com/ModelTC/LightX2V/tree/main/scripts/hunyuan_video_15) for usage tutorials.
- **November 21, 2025:** 🚀 We support the [HunyuanVideo-1.5](https://huggingface.co/tencent/HunyuanVideo-1.5) video generation model since Day 0. With the same number of GPUs, LightX2V can achieve a speed improvement of over 2 times and supports deployment on GPUs with lower memory (such as the 24GB RTX 4090). It also supports CFG/Ulysses parallelism, efficient offloading, TeaCache/MagCache technologies, and more. It also supports deployment on domestic chips such as Muxi and Cambricon. Quantized models and lightweight VAE models are now available: [Hy1.5-Quantized-Models](https://huggingface.co/lightx2v/Hy1.5-Quantized-Models) for quantized inference, and [LightTAE for HunyuanVideo-1.5](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaehy1_5.safetensors) for fast VAE decoding. We will soon update more models on our [HuggingFace page](https://huggingface.co/lightx2v), including step distillation, VAE distillation, and other related models. Refer to [this](https://github.com/ModelTC/LightX2V/tree/main/scripts/hunyuan_video_15) for usage tutorials, or check out the [examples directory](https://github.com/ModelTC/LightX2V/tree/main/examples) for code examples.
## 💡 Quick Start
For comprehensive usage instructions, please refer to our documentation: **[English Docs](https://lightx2v-en.readthedocs.io/en/latest/) | [中文文档](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/)**
### Installation from Git
```bash
pip install -v git+https://github.com/ModelTC/LightX2V.git
```
### Building from Source
```bash
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v . # pip install -v .
```
### (Optional) Install Attention/Quantize Operators
For attention operators installation, please refer to our documentation: **[English Docs](https://lightx2v-en.readthedocs.io/en/latest/getting_started/quickstart.html#step-4-install-attention-operators) | [中文文档](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/quickstart.html#id9)**
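After installing an operator, a quick way to confirm it is visible to Python is to check that its package imports. This is only a sanity-check sketch; `flash_attn` and `sageattention` are the upstream package names, not LightX2V modules.
```python
# Report which optional attention backends are importable in this environment.
import importlib.util

for pkg in ("flash_attn", "sageattention"):
    status = "available" if importlib.util.find_spec(pkg) else "not installed"
    print(f"{pkg}: {status}")
```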
### Quick Start
```python
# examples/hunyuan_video/hunyuan_t2v.py
from lightx2v import LightX2VPipeline
pipe = LightX2VPipeline(
    model_path="/path/to/ckpts/hunyuanvideo-1.5/",
    model_cls="hunyuan_video_1.5",
    transformer_model_name="720p_t2v",
    task="t2v",
)
pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    num_frames=121,
    guidance_scale=6.0,
    sample_shift=9.0,
    aspect_ratio="16:9",
    fps=24,
)
seed = 123
prompt = "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
negative_prompt = ""
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
    seed=seed,
    prompt=prompt,
    negative_prompt=negative_prompt,
    save_result_path=save_result_path,
)
```
> 💡 **More Examples**: For more usage examples including quantization, offloading, caching, and other advanced configurations, please refer to the [examples directory](https://github.com/ModelTC/LightX2V/tree/main/examples).
## 🤖 Supported Model Ecosystem
......@@ -37,7 +91,6 @@ For comprehensive usage instructions, please refer to our documentation: **[Engl
- ✅ [Qwen-Image-Edit-2509](https://huggingface.co/Qwen/Qwen-Image-Edit-2509)
### Quantized and Distilled Models/LoRAs (**🚀 Recommended: 4-step inference**)
- ✅ [Hy1.5-Quantized-Models](https://huggingface.co/lightx2v/Hy1.5-Quantized-Models)
- ✅ [Wan2.1-Distill-Models](https://huggingface.co/lightx2v/Wan2.1-Distill-Models)
- ✅ [Wan2.2-Distill-Models](https://huggingface.co/lightx2v/Wan2.2-Distill-Models)
- ✅ [Wan2.1-Distill-Loras](https://huggingface.co/lightx2v/Wan2.1-Distill-Loras)
......@@ -45,7 +98,6 @@ For comprehensive usage instructions, please refer to our documentation: **[Engl
### Lightweight Autoencoder Models (**🚀 Recommended: fast inference & low memory usage**)
- ✅ [Autoencoders](https://huggingface.co/lightx2v/Autoencoders)
🔔 Follow our [HuggingFace page](https://huggingface.co/lightx2v) for the latest model releases from our team.
### Autoregressive Models
......
......@@ -20,13 +20,66 @@
## :fire: 最新动态
- **2025年11月21日:** 🚀 我们Day0支持了[HunyuanVideo-1.5](https://huggingface.co/tencent/HunyuanVideo-1.5)的视频生成模型,同样GPU数量,LightX2V可带来约2倍以上的速度提升,并支持更低显存GPU部署(如24G RTX4090)。支持CFG并行/Ulysses并行,高效Offload,TeaCache/MagCache等技术。同时支持沐曦,寒武纪等国产芯片部署。我们很快将在我们的[HuggingFace主页](https://huggingface.co/lightx2v)更新量化,步数蒸馏,VAE蒸馏等相关模型。使用教程参考[这里](https://github.com/ModelTC/LightX2V/tree/main/scripts/hunyuan_video_15)
- **2025年11月21日:** 🚀 我们Day0支持了[HunyuanVideo-1.5](https://huggingface.co/tencent/HunyuanVideo-1.5)的视频生成模型,同样GPU数量,LightX2V可带来约2倍以上的速度提升,并支持更低显存GPU部署(如24G RTX4090)。支持CFG并行/Ulysses并行,高效Offload,TeaCache/MagCache等技术。同时支持沐曦,寒武纪等国产芯片部署。量化模型和轻量VAE模型现已可用:[Hy1.5-Quantized-Models](https://huggingface.co/lightx2v/Hy1.5-Quantized-Models)用于量化推理,[HunyuanVideo-1.5轻量TAE](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaehy1_5.safetensors)用于快速VAE解码。我们很快将在我们的[HuggingFace主页](https://huggingface.co/lightx2v)更新更多模型,包括步数蒸馏,VAE蒸馏等相关模型。使用教程参考[这里](https://github.com/ModelTC/LightX2V/tree/main/scripts/hunyuan_video_15),或查看[示例目录](https://github.com/ModelTC/LightX2V/tree/main/examples)获取代码示例
## 💡 快速开始
详细使用说明请参考我们的文档:**[英文文档](https://lightx2v-en.readthedocs.io/en/latest/) | [中文文档](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/)**
### 从 Git 安装
```bash
pip install -v git+https://github.com/ModelTC/LightX2V.git
```
### 从源码构建
```bash
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v . # pip install -v .
```
### (可选)安装注意力/量化算子
注意力算子安装说明请参考我们的文档:**[英文文档](https://lightx2v-en.readthedocs.io/en/latest/getting_started/quickstart.html#step-4-install-attention-operators) | [中文文档](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/quickstart.html#id9)**
### 快速开始
```python
# examples/hunyuan_video/hunyuan_t2v.py
from lightx2v import LightX2VPipeline
pipe = LightX2VPipeline(
    model_path="/path/to/ckpts/hunyuanvideo-1.5/",
    model_cls="hunyuan_video_1.5",
    transformer_model_name="720p_t2v",
    task="t2v",
)
pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    num_frames=121,
    guidance_scale=6.0,
    sample_shift=9.0,
    aspect_ratio="16:9",
    fps=24,
)
seed = 123
prompt = "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
negative_prompt = ""
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
    seed=seed,
    prompt=prompt,
    negative_prompt=negative_prompt,
    save_result_path=save_result_path,
)
```
> 💡 **更多示例**: 更多使用案例,包括量化、卸载、缓存等进阶配置,请参考 [examples 目录](https://github.com/ModelTC/LightX2V/tree/main/examples)。
## 🤖 支持的模型生态
### 官方开源模型
......
......@@ -102,13 +102,46 @@ git clone https://github.com/thu-ml/SageAttention.git
cd SageAttention && CUDA_ARCHITECTURES="8.0,8.6,8.9,9.0,12.0" EXT_PARALLEL=4 NVCC_APPEND_FLAGS="--threads 8" MAX_JOBS=32 pip install -v -e .
```
**Option D: Q8 Kernels**
#### Step 4: Install Quantization Operators (Optional)
Quantization operators are used to support model quantization, which can significantly reduce memory usage and accelerate inference. Choose the appropriate quantization operator based on your needs:
**Option A: VLLM Kernels (Recommended)**
Suitable for various quantization schemes, supports FP8 and other quantization formats.
```bash
pip install vllm
```
Or install from source for the latest features:
```bash
git clone https://github.com/vllm-project/vllm.git
cd vllm
uv pip install -e .
```
**Option B: SGL Kernels**
Suitable for SGL quantization scheme, requires torch == 2.8.0.
```bash
pip install sgl-kernel --upgrade
```
**Option C: Q8 Kernels**
Suitable for Ada architecture GPUs (such as RTX 4090, L40S, etc.).
```bash
git clone https://github.com/KONAKONA666/q8_kernels.git
cd q8_kernels && git submodule init && git submodule update
python setup.py install
```
> 💡 **Note**:
> - You can skip this step if you don't need quantization functionality
> - Quantized models can be downloaded from [LightX2V HuggingFace](https://huggingface.co/lightx2v)
> - For more quantization information, please refer to the [Quantization Documentation](method_tutorials/quantization.html)
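If you are unsure whether your GPU can run the FP8 kernels mentioned above, the sketch below performs a quick capability check. It assumes FP8 tensor-core support starts at compute capability 8.9 (Ada/Hopper and newer); consult the quantization documentation for the authoritative support matrix.
```python
import torch

# FP8 tensor cores are available on compute capability 8.9+ (Ada / Hopper and newer).
if torch.cuda.is_available():
    cap = torch.cuda.get_device_capability()
    print(f"Compute capability: {cap[0]}.{cap[1]}, FP8-capable: {cap >= (8, 9)}")
else:
    print("No CUDA device detected.")
```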
#### Step 5: Verify Installation
```python
......@@ -215,8 +248,27 @@ cd LightX2V
# Install Windows-specific dependencies
pip install -r requirements_win.txt
pip install -v -e .
```
#### Step 7: Install Quantization Operators (Optional)
Quantization operators are used to support model quantization, which can significantly reduce memory usage and accelerate inference.
**Install VLLM (Recommended):**
Download the corresponding wheel package from [vllm-windows releases](https://github.com/SystemPanic/vllm-windows/releases) and install it.
```cmd
# Install vLLM (please adjust according to actual filename)
pip install vllm-0.9.1+cu124-cp312-cp312-win_amd64.whl
```
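A minimal import check (sketch) confirms that the wheel was installed into the active environment:
```python
# Verify that the vLLM wheel installed above is importable.
import vllm

print("vLLM version:", vllm.__version__)
```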
> 💡 **Note**:
> - You can skip this step if you don't need quantization functionality
> - Quantized models can be downloaded from [LightX2V HuggingFace](https://huggingface.co/lightx2v)
> - For more quantization information, please refer to the [Quantization Documentation](method_tutorials/quantization.html)
## 🎯 Inference Usage
### 📥 Model Preparation
......@@ -249,6 +301,42 @@ bash scripts/wan/run_wan_t2v.sh
scripts\win\run_wan_t2v.bat
```
#### Python Script Launch
```python
from lightx2v import LightX2VPipeline
pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-T2V-14B",
    model_cls="wan2.1",
    task="t2v",
)
pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    height=480,  # 720
    width=832,  # 1280
    num_frames=81,
    guidance_scale=5.0,
    sample_shift=5.0,
)
seed = 42
prompt = "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
    seed=seed,
    prompt=prompt,
    negative_prompt=negative_prompt,
    save_result_path=save_result_path,
)
```
> 💡 **More Examples**: For more usage examples including quantization, offloading, caching, and other advanced configurations, please refer to the [examples directory](https://github.com/ModelTC/LightX2V/tree/main/examples).
## 📞 Get Help
If you encounter problems during installation or usage, please:
......
......@@ -83,7 +83,6 @@ conda activate lightx2v
pip install -v -e .
```
#### 步骤 4: 安装注意力机制算子
**选项 A: Flash Attention 2**
......@@ -103,13 +102,46 @@ git clone https://github.com/thu-ml/SageAttention.git
cd SageAttention && CUDA_ARCHITECTURES="8.0,8.6,8.9,9.0,12.0" EXT_PARALLEL=4 NVCC_APPEND_FLAGS="--threads 8" MAX_JOBS=32 pip install -v -e .
```
**选项 D: Q8 Kernels**
#### 步骤 4: 安装量化算子(可选)
量化算子用于支持模型量化功能,可以显著降低显存占用并加速推理。根据您的需求选择合适的量化算子:
**选项 A: VLLM Kernels(推荐)**
适用于多种量化方案,支持 FP8 等量化格式。
```bash
pip install vllm
```
或者从源码安装以获得最新功能:
```bash
git clone https://github.com/vllm-project/vllm.git
cd vllm
uv pip install -e .
```
**选项 B: SGL Kernels**
适用于 SGL 量化方案,需要 torch == 2.8.0。
```bash
pip install sgl-kernel --upgrade
```
**选项 C: Q8 Kernels**
适用于 Ada 架构显卡(如 RTX 4090、L40S 等)。
```bash
git clone https://github.com/KONAKONA666/q8_kernels.git
cd q8_kernels && git submodule init && git submodule update
python setup.py install
```
> 💡 **提示**:
> - 如果不需要使用量化功能,可以跳过此步骤
> - 量化模型可以从 [LightX2V HuggingFace](https://huggingface.co/lightx2v) 下载
> - 更多量化相关信息请参考 [量化文档](method_tutorials/quantization.html)
#### 步骤 5: 验证安装
```python
import lightx2v
......@@ -215,6 +247,31 @@ cd LightX2V
# 安装 Windows 专用依赖
pip install -r requirements_win.txt
pip install -v -e .
```
#### 步骤 7: 安装量化算子(可选)
量化算子用于支持模型量化功能,可以显著降低显存占用并加速推理。
**安装 VLLM(推荐):**
从 [vllm-windows releases](https://github.com/SystemPanic/vllm-windows/releases) 下载对应的 wheel 包并安装。
```cmd
# 安装 vLLM(请根据实际文件名调整)
pip install vllm-0.9.1+cu124-cp312-cp312-win_amd64.whl
```
> 💡 **提示**:
> - 如果不需要使用量化功能,可以跳过此步骤
> - 量化模型可以从 [LightX2V HuggingFace](https://huggingface.co/lightx2v) 下载
> - 更多量化相关信息请参考 [量化文档](method_tutorials/quantization.html)
#### 步骤 8: 验证安装
```python
import lightx2v
print(f"LightX2V 版本: {lightx2v.__version__}")
```
## 🎯 推理使用
......@@ -248,6 +305,40 @@ bash scripts/wan/run_wan_t2v.sh
# 使用 Windows 批处理脚本
scripts\win\run_wan_t2v.bat
```
#### Python脚本启动
```python
from lightx2v import LightX2VPipeline
pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-T2V-14B",
    model_cls="wan2.1",
    task="t2v",
)
pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    height=480,  # 720
    width=832,  # 1280
    num_frames=81,
    guidance_scale=5.0,
    sample_shift=5.0,
)
seed = 42
prompt = "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
    seed=seed,
    prompt=prompt,
    negative_prompt=negative_prompt,
    save_result_path=save_result_path,
)
```
## 📞 获取帮助
......
# LightX2V Usage Examples
This document introduces how to use LightX2V for video generation, including basic usage and advanced configurations.
## 📋 Table of Contents
- [Environment Setup](#environment-setup)
- [Basic Usage Examples](#basic-usage-examples)
- [Model Path Configuration](#model-path-configuration)
- [Creating Generator](#creating-generator)
- [Advanced Configurations](#advanced-configurations)
- [Parameter Offloading](#parameter-offloading)
- [Model Quantization](#model-quantization)
- [Parallel Inference](#parallel-inference)
- [Feature Caching](#feature-caching)
- [Lightweight VAE](#lightweight-vae)
## 🔧 Environment Setup
Please refer to the main project's [Quick Start Guide](../docs/EN/source/getting_started/quickstart.md) for environment setup.
## 🚀 Basic Usage Examples
A minimal code example can be found in `examples/wan_t2v.py`:
```python
from lightx2v import LightX2VPipeline
pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-T2V-14B",
    model_cls="wan2.1",
    task="t2v",
)
pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    height=480,
    width=832,
    num_frames=81,
    guidance_scale=5.0,
    sample_shift=5.0,
)
seed = 42
prompt = "Your prompt here"
negative_prompt = ""
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
    seed=seed,
    prompt=prompt,
    negative_prompt=negative_prompt,
    save_result_path=save_result_path,
)
```
## 📁 Model Path Configuration
### Basic Configuration
Pass the model path to `LightX2VPipeline`:
```python
pipe = LightX2VPipeline(
    image_path="/path/to/img_0.jpg",  # Required for I2V tasks
    model_path="/path/to/Wan2.2-I2V-A14B",
    model_cls="wan2.2_moe",  # For wan2.1, use "wan2.1"
    task="i2v",
)
```
### Specifying Multiple Model Weight Versions
When the `model_path` directory contains multiple versions of bf16-precision DIT model safetensors files, use the following parameters to specify which weights to load:
- **`dit_original_ckpt`**: Used to specify the original DIT weight path for models like wan2.1 and hunyuan15
- **`low_noise_original_ckpt`**: Used to specify the low noise branch weight path for wan2.2 models
- **`high_noise_original_ckpt`**: Used to specify the high noise branch weight path for wan2.2 models
**Usage Example:**
```python
pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.2-I2V-A14B",
    model_cls="wan2.2_moe",
    task="i2v",
    low_noise_original_ckpt="/path/to/low_noise_model.safetensors",
    high_noise_original_ckpt="/path/to/high_noise_model.safetensors",
)
```
## 🎛️ Creating Generator
### Loading from Configuration File
The generator can be loaded directly from a JSON configuration file. Configuration files are located in the `configs` directory:
```python
pipe.create_generator(config_json="../configs/wan/wan_t2v.json")
```
### Creating Generator Manually
You can also create the generator manually and configure multiple parameters:
```python
pipe.create_generator(
    attn_mode="flash_attn2",  # Options: flash_attn2, flash_attn3, sage_attn2, sage_attn3 (Blackwell GPUs)
    infer_steps=50,  # Number of inference steps
    num_frames=81,  # Number of video frames
    height=480,  # Video height
    width=832,  # Video width
    guidance_scale=5.0,  # CFG guidance strength (CFG disabled when =1)
    sample_shift=5.0,  # Sample shift
    fps=16,  # Frame rate
    aspect_ratio="16:9",  # Aspect ratio
    boundary=0.900,  # Boundary value
    boundary_step_index=2,  # Boundary step index
    denoising_step_list=[1000, 750, 500, 250],  # Denoising step list
)
```
**Parameter Description:**
- **Resolution**: Specified via `height` and `width`
- **CFG**: Specified via `guidance_scale` (set to 1 to disable CFG)
- **FPS**: Specified via `fps`
- **Video Length**: Specified via `num_frames`
- **Inference Steps**: Specified via `infer_steps`
- **Sample Shift**: Specified via `sample_shift`
- **Attention Mode**: Specified via `attn_mode`; options include `flash_attn2`, `flash_attn3`, `sage_attn2`, and `sage_attn3` (the last requires Blackwell GPUs)
## ⚙️ Advanced Configurations
**⚠️ Important: When manually creating a generator, you can configure some advanced options. All advanced configurations must be specified before `create_generator()`, otherwise they will not take effect!**
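For example, building on the basic Wan2.1 setup above, offloading is enabled on the pipeline first and `create_generator()` is called afterwards. This is a minimal ordering sketch; the `enable_offload` arguments not shown here are assumed to keep their defaults.
```python
from lightx2v import LightX2VPipeline

pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-T2V-14B",
    model_cls="wan2.1",
    task="t2v",
)

# 1. Configure advanced options first ...
pipe.enable_offload(
    cpu_offload=True,
    offload_granularity="block",
)

# 2. ... then create the generator, which picks up the options above.
pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    height=480,
    width=832,
    num_frames=81,
    guidance_scale=5.0,
    sample_shift=5.0,
)
```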
### Parameter Offloading
Significantly reduces memory usage with almost no impact on inference speed. Suitable for RTX 30/40/50 series GPUs.
```python
pipe.enable_offload(
    cpu_offload=True,  # Enable CPU offloading
    offload_granularity="block",  # Offload granularity: "block" or "phase"
    text_encoder_offload=False,  # Whether to offload text encoder
    image_encoder_offload=False,  # Whether to offload image encoder
    vae_offload=False,  # Whether to offload VAE
)
```
**Notes:**
- For Wan models, `offload_granularity` supports both `"block"` and `"phase"`
- For HunyuanVideo-1.5, only `"block"` is currently supported
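For Wan models, the finer-grained setting can therefore be selected as shown below (same arguments as above, only the granularity changes):
```python
# Finer-grained offloading; valid for Wan models only
# ("phase" is not yet supported for HunyuanVideo-1.5).
pipe.enable_offload(
    cpu_offload=True,
    offload_granularity="phase",
    text_encoder_offload=False,
    image_encoder_offload=False,
    vae_offload=False,
)
```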
### Model Quantization
Quantization can significantly reduce memory usage and accelerate inference.
```python
pipe.enable_quantize(
    dit_quantized=False,  # Whether to use quantized DIT model
    text_encoder_quantized=False,  # Whether to use quantized text encoder
    image_encoder_quantized=False,  # Whether to use quantized image encoder
    dit_quantized_ckpt=None,  # DIT quantized weight path (required when model_path doesn't contain quantized weights or has multiple weight files)
    low_noise_quantized_ckpt=None,  # Wan2.2 low noise branch quantized weight path
    high_noise_quantized_ckpt=None,  # Wan2.2 high noise branch quantized weight path
    text_encoder_quantized_ckpt=None,  # Text encoder quantized weight path (required when model_path doesn't contain quantized weights or has multiple weight files)
    image_encoder_quantized_ckpt=None,  # Image encoder quantized weight path (required when model_path doesn't contain quantized weights or has multiple weight files)
    quant_scheme="fp8-sgl",  # Quantization scheme
)
```
**Parameter Description:**
- **`dit_quantized_ckpt`**: When the `model_path` directory doesn't contain quantized weights, or has multiple weight files, you need to specify the specific DIT quantized weight path
- **`text_encoder_quantized_ckpt`** and **`image_encoder_quantized_ckpt`**: Similarly, used to specify encoder quantized weight paths
- **`low_noise_quantized_ckpt`** and **`high_noise_quantized_ckpt`**: Used to specify dual-branch quantized weights for Wan2.2 models
**Quantized Model Downloads:**
- **Wan-2.1 Quantized Models**: Download from [Wan2.1-Distill-Models](https://huggingface.co/lightx2v/Wan2.1-Distill-Models)
- **Wan-2.2 Quantized Models**: Download from [Wan2.2-Distill-Models](https://huggingface.co/lightx2v/Wan2.2-Distill-Models)
- **HunyuanVideo-1.5 Quantized Models**: Download from [Hy1.5-Quantized-Models](https://huggingface.co/lightx2v/Hy1.5-Quantized-Models)
- `hy15_qwen25vl_llm_encoder_fp8_e4m3_lightx2v.safetensors` is the quantized weight for the text encoder
**Usage Examples:**
```python
# HunyuanVideo-1.5 Quantization Example
pipe.enable_quantize(
    quant_scheme='fp8-sgl',
    dit_quantized=True,
    dit_quantized_ckpt="/path/to/hy15_720p_i2v_fp8_e4m3_lightx2v.safetensors",
    text_encoder_quantized=True,
    image_encoder_quantized=False,
    text_encoder_quantized_ckpt="/path/to/hy15_qwen25vl_llm_encoder_fp8_e4m3_lightx2v.safetensors",
)

# Wan2.1 Quantization Example
pipe.enable_quantize(
    dit_quantized=True,
    dit_quantized_ckpt="/path/to/wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v_4step.safetensors",
)

# Wan2.2 Quantization Example
pipe.enable_quantize(
    dit_quantized=True,
    low_noise_quantized_ckpt="/path/to/wan2.2_i2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors",
    high_noise_quantized_ckpt="/path/to/wan2.2_i2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step_1030.safetensors",
)
```
**Quantization Scheme Reference:** For detailed information, please refer to the [Quantization Documentation](../docs/EN/source/method_tutorials/quantization.md)
### Parallel Inference
Supports multi-GPU parallel inference. Requires running with `torchrun`:
```python
pipe.enable_parallel(
    seq_p_size=4,  # Sequence parallel size
    seq_p_attn_type="ulysses",  # Sequence parallel attention type
)
```
**Running Method:**
```bash
torchrun --nproc_per_node=4 your_script.py
```
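Putting it together, a complete 4-GPU script might look like the sketch below. It reuses the basic Wan2.1 example and, like the other advanced options, enables parallelism before `create_generator()`; launch it with the `torchrun` command above.
```python
# your_script.py, launched with: torchrun --nproc_per_node=4 your_script.py
from lightx2v import LightX2VPipeline

pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-T2V-14B",
    model_cls="wan2.1",
    task="t2v",
)

# Ulysses sequence parallelism across the 4 ranks started by torchrun.
pipe.enable_parallel(
    seq_p_size=4,
    seq_p_attn_type="ulysses",
)

pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    height=480,
    width=832,
    num_frames=81,
    guidance_scale=5.0,
    sample_shift=5.0,
)

pipe.generate(
    seed=42,
    prompt="Your prompt here",
    negative_prompt="",
    save_result_path="/path/to/save_results/output.mp4",
)
```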
### Feature Caching
Set the cache method to `'Tea'` or `'Mag'` to use the TeaCache or MagCache method, respectively:
```python
pipe.enable_cache(
    cache_method='Tea',  # Cache method: 'Tea' or 'Mag'
    coefficients=[-3.08907507e+04, 1.67786188e+04, -3.19178643e+03,
                  2.60740519e+02, -8.19205881e+00, 1.07913775e-01],  # Coefficients
    teacache_thresh=0.15,  # TeaCache threshold
)
```
**Coefficient Reference:** Refer to configuration files in `configs/caching` or `configs/hunyuan_video_15/cache` directories
### Lightweight VAE
Using lightweight VAE can accelerate decoding and reduce memory usage.
```python
pipe.enable_lightvae(
    use_lightvae=False,  # Whether to use LightVAE
    use_tae=False,  # Whether to use LightTAE
    vae_path=None,  # Path to LightVAE or LightTAE
)
```
**Support Status:**
- **LightVAE**: Currently only supports wan2.1, wan2.2 moe
- **LightTAE**: Currently only supports wan2.1, wan2.2-ti2v, wan2.2 moe, HunyuanVideo-1.5
**Model Downloads:** Lightweight VAE models can be downloaded from [Autoencoders](https://huggingface.co/lightx2v/Autoencoders)
- LightVAE for Wan-2.1: [lightvaew2_1.safetensors](https://huggingface.co/lightx2v/Autoencoders/blob/main/lightvaew2_1.safetensors)
- LightTAE for Wan-2.1: [lighttaew2_1.safetensors](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaew2_1.safetensors)
- LightTAE for Wan-2.2-ti2v: [lighttaew2_2.safetensors](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaew2_2.safetensors)
- LightTAE for HunyuanVideo-1.5: [lighttaehy1_5.safetensors](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaehy1_5.safetensors)
**Usage Example:**
```python
# Using LightTAE for HunyuanVideo-1.5
pipe.enable_lightvae(
    use_tae=True,
    tae_path="/path/to/lighttaehy1_5.safetensors",
    use_lightvae=False,
    vae_path=None,
)
```
## 📚 More Resources
- [Full Documentation](https://lightx2v-en.readthedocs.io/en/latest/)
- [GitHub Repository](https://github.com/ModelTC/LightX2V)
- [HuggingFace Model Hub](https://huggingface.co/lightx2v)
# LightX2V 使用示例
本文档介绍如何使用 LightX2V 进行视频生成,包括基础使用和进阶配置。
## 📋 目录
- [环境安装](#环境安装)
- [基础运行示例](#基础运行示例)
- [模型路径配置](#模型路径配置)
- [创建生成器](#创建生成器)
- [进阶配置](#进阶配置)
- [参数卸载 (Offload)](#参数卸载-offload)
- [模型量化 (Quantization)](#模型量化-quantization)
- [并行推理 (Parallel Inference)](#并行推理-parallel-inference)
- [特征缓存 (Cache)](#特征缓存-cache)
- [轻量 VAE (Light VAE)](#轻量-vae-light-vae)
## 🔧 环境安装
请参考主项目的[快速入门文档](../docs/ZH_CN/source/getting_started/quickstart.md)进行环境安装。
## 🚀 基础运行示例
最小化代码示例可参考 `examples/wan_t2v.py`
```python
from lightx2v import LightX2VPipeline
pipe = LightX2VPipeline(
model_path="/path/to/Wan2.1-T2V-14B",
model_cls="wan2.1",
task="t2v",
)
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=50,
height=480,
width=832,
num_frames=81,
guidance_scale=5.0,
sample_shift=5.0,
)
seed = 42
prompt = "Your prompt here"
negative_prompt = ""
save_result_path="/path/to/save_results/output.mp4"
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
```
## 📁 模型路径配置
### 基础配置
将模型路径传入 `LightX2VPipeline`
```python
pipe = LightX2VPipeline(
image_path="/path/to/img_0.jpg", # I2V 任务需要
model_path="/path/to/Wan2.2-I2V-A14B",
model_cls="wan2.2_moe", # 对于 wan2.1,使用 "wan2.1"
task="i2v",
)
```
### 多版本模型权重指定
当 `model_path` 目录下存在多个不同版本的 bf16 精度 DIT 模型 safetensors 文件时,需要使用以下参数指定具体使用哪个权重:
- **`dit_original_ckpt`**: 用于指定 wan2.1 和 hunyuan15 等模型的原始 DIT 权重路径
- **`low_noise_original_ckpt`**: 用于指定 wan2.2 模型的低噪声分支权重路径
- **`high_noise_original_ckpt`**: 用于指定 wan2.2 模型的高噪声分支权重路径
**使用示例:**
```python
pipe = LightX2VPipeline(
model_path="/path/to/Wan2.2-I2V-A14B",
model_cls="wan2.2_moe",
task="i2v",
low_noise_original_ckpt="/path/to/low_noise_model.safetensors",
high_noise_original_ckpt="/path/to/high_noise_model.safetensors",
)
```
## 🎛️ 创建生成器
### 从配置文件加载
生成器可以从 JSON 配置文件直接加载,配置文件位于 `configs` 目录:
```python
pipe.create_generator(config_json="../configs/wan/wan_t2v.json")
```
### 手动创建生成器
也可以手动创建生成器,并配置多个参数:
```python
pipe.create_generator(
attn_mode="flash_attn2", # 可选: flash_attn2, flash_attn3, sage_attn2, sage_attn3 (B架构显卡适用)
infer_steps=50, # 推理步数
num_frames=81, # 视频帧数
height=480, # 视频高度
width=832, # 视频宽度
guidance_scale=5.0, # CFG引导强度 (=1时禁用CFG)
sample_shift=5.0, # 采样偏移
fps=16, # 帧率
aspect_ratio="16:9", # 宽高比
boundary=0.900, # 边界值
boundary_step_index=2, # 边界步索引
denoising_step_list=[1000, 750, 500, 250], # 去噪步列表
)
```
**参数说明:**
- **分辨率**: 通过 `height` 和 `width` 指定
- **CFG**: 通过 `guidance_scale` 指定(设置为 1 时禁用 CFG)
- **FPS**: 通过 `fps` 指定帧率
- **视频长度**: 通过 `num_frames` 指定帧数
- **推理步数**: 通过 `infer_steps` 指定
- **采样偏移**: 通过 `sample_shift` 指定
- **注意力模式**: 通过 `attn_mode` 指定,可选 `flash_attn2`, `flash_attn3`, `sage_attn2`, `sage_attn3`(B架构显卡适用)
## ⚙️ 进阶配置
**⚠️ 重要提示:手动创建生成器时,可以配置一些进阶选项,所有进阶配置必须在 `create_generator()` 之前指定,否则会失效!**
### 参数卸载 (Offload)
显著降低显存占用,几乎不影响推理速度,适用于 RTX 30/40/50 系列显卡。
```python
pipe.enable_offload(
cpu_offload=True, # 启用 CPU 卸载
offload_granularity="block", # 卸载粒度: "block" 或 "phase"
text_encoder_offload=False, # 文本编码器是否卸载
image_encoder_offload=False, # 图像编码器是否卸载
vae_offload=False, # VAE 是否卸载
)
```
**说明:**
- 对于 Wan 模型,`offload_granularity` 支持 `"block"` 和 `"phase"`
- 对于 HunyuanVideo-1.5,目前只支持 `"block"`
### 模型量化 (Quantization)
量化可以显著降低显存占用并加速推理。
```python
pipe.enable_quantize(
dit_quantized=False, # 是否使用量化的 DIT 模型
text_encoder_quantized=False, # 是否使用量化的文本编码器
image_encoder_quantized=False, # 是否使用量化的图像编码器
dit_quantized_ckpt=None, # DIT 量化权重路径(当 model_path 下没有量化权重或存在多个权重时需要指定)
low_noise_quantized_ckpt=None, # Wan2.2 低噪声分支量化权重路径
high_noise_quantized_ckpt=None, # Wan2.2 高噪声分支量化权重路径
text_encoder_quantized_ckpt=None, # 文本编码器量化权重路径(当 model_path 下没有量化权重或存在多个权重时需要指定)
image_encoder_quantized_ckpt=None, # 图像编码器量化权重路径(当 model_path 下没有量化权重或存在多个权重时需要指定)
quant_scheme="fp8-sgl", # 量化方案
)
```
**参数说明:**
- **`dit_quantized_ckpt`**: 当 `model_path` 目录下没有量化权重,或存在多个权重文件时,需要指定具体的 DIT 量化权重路径
- **`text_encoder_quantized_ckpt`** 和 **`image_encoder_quantized_ckpt`**: 类似地,用于指定编码器的量化权重路径
- **`low_noise_quantized_ckpt`** 和 **`high_noise_quantized_ckpt`**: 用于指定 Wan2.2 模型的双分支量化权重
**量化模型下载:**
- **Wan-2.1 量化模型**: 从 [Wan2.1-Distill-Models](https://huggingface.co/lightx2v/Wan2.1-Distill-Models) 下载
- **Wan-2.2 量化模型**: 从 [Wan2.2-Distill-Models](https://huggingface.co/lightx2v/Wan2.2-Distill-Models) 下载
- **HunyuanVideo-1.5 量化模型**: 从 [Hy1.5-Quantized-Models](https://huggingface.co/lightx2v/Hy1.5-Quantized-Models) 下载
- `hy15_qwen25vl_llm_encoder_fp8_e4m3_lightx2v.safetensors` 是文本编码器的量化权重
**使用示例:**
```python
# HunyuanVideo-1.5 量化示例
pipe.enable_quantize(
quant_scheme='fp8-sgl',
dit_quantized=True,
dit_quantized_ckpt="/path/to/hy15_720p_i2v_fp8_e4m3_lightx2v.safetensors",
text_encoder_quantized=True,
image_encoder_quantized=False,
text_encoder_quantized_ckpt="/path/to/hy15_qwen25vl_llm_encoder_fp8_e4m3_lightx2v.safetensors",
)
# Wan2.1 量化示例
pipe.enable_quantize(
dit_quantized=True,
dit_quantized_ckpt="/path/to/wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v_4step.safetensors",
)
# Wan2.2 量化示例
pipe.enable_quantize(
dit_quantized=True,
low_noise_quantized_ckpt="/path/to/wan2.2_i2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors",
high_noise_quantized_ckpt="/path/to/wan2.2_i2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step_1030.safetensors",
)
```
**量化方案参考:** 详细说明请参考 [量化文档](../docs/ZH_CN/source/method_tutorials/quantization.md)
### 并行推理 (Parallel Inference)
支持多 GPU 并行推理,需要使用 `torchrun` 运行:
```python
pipe.enable_parallel(
seq_p_size=4, # 序列并行大小
seq_p_attn_type="ulysses", # 序列并行注意力类型
)
```
**运行方式:**
```bash
torchrun --nproc_per_node=4 your_script.py
```
### 特征缓存 (Cache)
可以指定缓存方法为 Mag 或 Tea,使用 MagCache 和 TeaCache 方法:
```python
pipe.enable_cache(
cache_method='Tea', # 缓存方法: 'Tea' 或 'Mag'
coefficients=[-3.08907507e+04, 1.67786188e+04, -3.19178643e+03,
2.60740519e+02, -8.19205881e+00, 1.07913775e-01], # 系数
teacache_thresh=0.15, # TeaCache 阈值
)
```
**系数参考:** 可参考 `configs/caching` 或 `configs/hunyuan_video_15/cache` 目录下的配置文件
### 轻量 VAE (Light VAE)
使用轻量 VAE 可以加速解码并降低显存占用。
```python
pipe.enable_lightvae(
use_lightvae=False, # 是否使用 LightVAE
use_tae=False, # 是否使用 LightTAE
vae_path=None, # LightVAE 或 LightTAE 的路径
)
```
**支持情况:**
- **LightVAE**: 目前只支持 wan2.1、wan2.2 moe
- **LightTAE**: 目前只支持 wan2.1、wan2.2-ti2v、wan2.2 moe、HunyuanVideo-1.5
**模型下载:** 轻量 VAE 模型可从 [Autoencoders](https://huggingface.co/lightx2v/Autoencoders) 下载
- Wan-2.1 的 LightVAE: [lightvaew2_1.safetensors](https://huggingface.co/lightx2v/Autoencoders/blob/main/lightvaew2_1.safetensors)
- Wan-2.1 的 LightTAE: [lighttaew2_1.safetensors](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaew2_1.safetensors)
- Wan-2.2-ti2v 的 LightTAE: [lighttaew2_2.safetensors](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaew2_2.safetensors)
- HunyuanVideo-1.5 的 LightTAE: [lighttaehy1_5.safetensors](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaehy1_5.safetensors)
**使用示例:**
```python
# 使用 HunyuanVideo-1.5 的 LightTAE
pipe.enable_lightvae(
use_tae=True,
tae_path="/path/to/lighttaehy1_5.safetensors",
use_lightvae=False,
vae_path=None
)
```
## 📚 更多资源
- [完整文档](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/)
- [GitHub 仓库](https://github.com/ModelTC/LightX2V)
- [HuggingFace 模型库](https://huggingface.co/lightx2v)
"""
HunyuanVideo-1.5 image-to-video generation example with quantization.
This example demonstrates how to use LightX2V with HunyuanVideo-1.5 model for I2V generation,
including quantized model usage for reduced memory consumption.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for HunyuanVideo-1.5 I2V task
pipe = LightX2VPipeline(
image_path="/path/to/assets/inputs/imgs/img_0.jpg",
model_path="/path/to/ckpts/hunyuanvideo-1.5/",
model_cls="hunyuan_video_1.5",
transformer_model_name="720p_i2v",
task="i2v",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(config_json="../configs/hunyuan_video_15/hunyuan_video_i2v_720p.json")
# Enable offloading to significantly reduce VRAM usage with minimal speed impact
# Suitable for RTX 30/40/50 consumer GPUs
pipe.enable_offload(
cpu_offload=True,
offload_granularity="block", # For HunyuanVideo-1.5, only "block" is supported
text_encoder_offload=True,
image_encoder_offload=False,
vae_offload=False,
)
# Enable quantization for reduced memory usage
# Quantized models can be downloaded from: https://huggingface.co/lightx2v/Hy1.5-Quantized-Models
pipe.enable_quantize(
quant_scheme="fp8-sgl",
dit_quantized=True,
dit_quantized_ckpt="/path/to/hy15_720p_i2v_fp8_e4m3_lightx2v.safetensors",
text_encoder_quantized=True,
image_encoder_quantized=False,
text_encoder_quantized_ckpt="/path/to/hy15_qwen25vl_llm_encoder_fp8_e4m3_lightx2v.safetensors",
)
# Create generator with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=50,
num_frames=121,
guidance_scale=6.0,
sample_shift=7.0,
fps=24,
)
# Generation parameters
seed = 42
prompt = "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
negative_prompt = ""
save_result_path = "/path/to/save_results/output2.mp4"
# Generate video
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
"""
HunyuanVideo-1.5 text-to-video generation example.
This example demonstrates how to use LightX2V with HunyuanVideo-1.5 model for T2V generation.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for HunyuanVideo-1.5
pipe = LightX2VPipeline(
model_path="/path/to/ckpts/hunyuanvideo-1.5/",
model_cls="hunyuan_video_1.5",
transformer_model_name="720p_t2v",
task="t2v",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(config_json="../configs/hunyuan_video_15/hunyuan_video_t2v_720p.json")
# Enable offloading to significantly reduce VRAM usage with minimal speed impact
# Suitable for RTX 30/40/50 consumer GPUs
pipe.enable_offload(
cpu_offload=True,
offload_granularity="block", # For HunyuanVideo-1.5, only "block" is supported
text_encoder_offload=True,
image_encoder_offload=False,
vae_offload=False,
)
# Use lighttae
pipe.enable_lightvae(
use_tae=True,
tae_path="/data/nvme0/gushiqiao/models/hy_tae_models/model8.pth",
use_lightvae=False,
vae_path=None,
)
# Create generator with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=50,
num_frames=121,
guidance_scale=6.0,
sample_shift=9.0,
aspect_ratio="16:9",
fps=24,
)
# Generation parameters
seed = 123
prompt = "A close-up shot captures a scene on a polished, light-colored granite kitchen counter, illuminated by soft natural light from an unseen window. Initially, the frame focuses on a tall, clear glass filled with golden, translucent apple juice standing next to a single, shiny red apple with a green leaf still attached to its stem. The camera moves horizontally to the right. As the shot progresses, a white ceramic plate smoothly enters the frame, revealing a fresh arrangement of about seven or eight more apples, a mix of vibrant reds and greens, piled neatly upon it. A shallow depth of field keeps the focus sharply on the fruit and glass, while the kitchen backsplash in the background remains softly blurred. The scene is in a realistic style."
negative_prompt = ""
save_result_path = "/path/to/save_results/output.mp4"
# Generate video
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
"""
Wan2.2 animate video generation example.
This example demonstrates how to use LightX2V with Wan2.2 model for animate video generation.
First, run preprocessing:
1. Set up environment: pip install -r ../requirements_animate.txt
2. For animate mode:
python ../tools/preprocess/preprocess_data.py \
--ckpt_path /path/to/Wan2.1-FLF2V-14B-720P/process_checkpoint \
--video_path /path/to/video \
--refer_path /path/to/ref_img \
--save_path ../save_results/animate/process_results \
--resolution_area 1280 720 \
--retarget_flag
3. For replace mode:
python ../tools/preprocess/preprocess_data.py \
--ckpt_path /path/to/Wan2.1-FLF2V-14B-720P/process_checkpoint \
--video_path /path/to/video \
--refer_path /path/to/ref_img \
--save_path ../save_results/replace/process_results \
--resolution_area 1280 720 \
--iterations 3 \
--k 7 \
--w_len 1 \
--h_len 1 \
--replace_flag
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for animate task
pipe = LightX2VPipeline(
model_path="/path/to/Wan2.1-FLF2V-14B-720P",
src_pose_path="../save_results/animate/process_results/src_pose.mp4",
src_face_path="../save_results/animate/process_results/src_face.mp4",
src_ref_images="../save_results/animate/process_results/src_ref.png",
model_cls="wan2.2_animate",
task="animate",
)
pipe.replace_flag = True # Set to True for replace mode, False for animate mode
# Alternative: create generator from config JSON file
# pipe.create_generator(
# config_json="../configs/wan/wan_animate_replace.json"
# )
# Create generator with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=20,
height=480, # Can be set to 720 for higher resolution
width=832, # Can be set to 1280 for higher resolution
num_frames=77,
guidance_scale=1,
sample_shift=5.0,
fps=30,
)
seed = 42
prompt = "视频中的人在做动作"
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
"""
Wan2.1 first-last-frame-to-video generation example.
This example demonstrates how to use LightX2V with Wan2.1 model for FLF2V generation.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for FLF2V task
pipe = LightX2VPipeline(
model_path="/path/to/Wan2.1-FLF2V-14B-720P",
image_path="../assets/inputs/imgs/flf2v_input_first_frame-fs8.png",
last_frame_path="../assets/inputs/imgs/flf2v_input_last_frame-fs8.png",
model_cls="wan2.1",
task="flf2v",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(
# config_json="../configs/wan/wan_flf2v.json"
# )
# Optional: enable offloading to significantly reduce VRAM usage
# Suitable for RTX 30/40/50 consumer GPUs
# pipe.enable_offload(
# cpu_offload=True,
# offload_granularity="block",
# text_encoder_offload=True,
# image_encoder_offload=False,
# vae_offload=False,
# )
# Create generator with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=40,
height=480, # Can be set to 720 for higher resolution
width=832, # Can be set to 1280 for higher resolution
num_frames=81,
guidance_scale=5,
sample_shift=5.0,
)
seed = 42
prompt = "CG animation style, a small blue bird takes off from the ground, flapping its wings. The bird’s feathers are delicate, with a unique pattern on its chest. The background shows a blue sky with white clouds under bright sunshine. The camera follows the bird upward, capturing its flight and the vastness of the sky from a close-up, low-angle perspective."
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
"""
Wan2.2 image-to-video generation example.
This example demonstrates how to use LightX2V with Wan2.2 model for I2V generation.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for Wan2.2 I2V task
# For wan2.1, use model_cls="wan2.1"
pipe = LightX2VPipeline(
image_path="/path/to/img_0.jpg",
model_path="/path/to/Wan2.2-I2V-A14B",
model_cls="wan2.2_moe",
task="i2v",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(
# config_json="../configs/wan22/wan_moe_i2v.json"
# )
# Enable offloading to significantly reduce VRAM usage with minimal speed impact
# Suitable for RTX 30/40/50 consumer GPUs
pipe.enable_offload(
cpu_offload=True,
offload_granularity="block", # For Wan models, supports both "block" and "phase"
text_encoder_offload=True,
image_encoder_offload=False,
vae_offload=False,
)
# Create generator manually with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=40,
height=480, # Can be set to 720 for higher resolution
width=832, # Can be set to 1280 for higher resolution
num_frames=81,
guidance_scale=[3.5, 3.5], # For wan2.1, guidance_scale is a scalar (e.g., 5.0)
sample_shift=5.0,
)
# Generation parameters
seed = 42
prompt = "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
# Generate video
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
"""
Wan2.2 distilled model image-to-video generation example.
This example demonstrates how to use LightX2V with Wan2.2 distilled model for I2V generation.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for Wan2.2 distilled I2V task
# For wan2.1, use model_cls="wan2.1_distill"
pipe = LightX2VPipeline(
image_path="/path/to/img_0.jpg",
model_path="/path/to/wan2.2/Wan2.2-I2V-A14B",
model_cls="wan2.2_moe_distill",
task="i2v",
# Distilled weights: For wan2.1, only need to specify dit_original_ckpt="/path/to/wan2.1_i2v_720p_lightx2v_4step.safetensors"
low_noise_original_ckpt="/path/to/wan2.2_i2v_A14b_low_noise_lightx2v_4step.safetensors",
high_noise_original_ckpt="/path/to/wan2.2_i2v_A14b_high_noise_lightx2v_4step_1030.safetensors",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(
# config_json="../configs/wan22/wan_moe_i2v_distill.json"
# )
# Enable offloading to significantly reduce VRAM usage
# Suitable for RTX 30/40/50 consumer GPUs
pipe.enable_offload(
cpu_offload=True,
offload_granularity="block",
text_encoder_offload=True,
image_encoder_offload=False,
vae_offload=False,
)
# Create generator manually with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=4,
height=480, # Can be set to 720 for higher resolution
width=832, # Can be set to 1280 for higher resolution
num_frames=81,
guidance_scale=1,
sample_shift=5.0,
)
seed = 42
prompt = "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
"""
Wan2.2 distilled model with LoRA image-to-video generation example.
This example demonstrates how to use LightX2V with Wan2.2 distilled model and LoRA for I2V generation.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for Wan2.2 distilled I2V task with LoRA
# For wan2.1, use model_cls="wan2.1_distill"
pipe = LightX2VPipeline(
image_path="/path/to/img_0.jpg",
model_path="/path/to/wan2.2/Wan2.2-I2V-A14B",
model_cls="wan2.2_moe_distill",
task="i2v",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(
# config_json="../configs/wan22/wan_moe_i2v_distill_with_lora.json"
# )
# Enable offloading to significantly reduce VRAM usage
# Suitable for RTX 30/40/50 consumer GPUs
pipe.enable_offload(
cpu_offload=True,
offload_granularity="block",
text_encoder_offload=True,
image_encoder_offload=False,
vae_offload=False,
)
# Load distilled LoRA weights
pipe.enable_lora(
[
{"name": "high_noise_model", "path": "/data/nvme0/gushiqiao/models/old_loras/loras/wan2.2_i2v_A14b_high_noise_lora_rank64_lightx2v_4step_1022.safetensors", "strength": 1.0},
{"name": "low_noise_model", "path": "/data/nvme0/gushiqiao/models/old_loras/loras/wan2.2_i2v_A14b_low_noise_lora_rank64_lightx2v_4step_1022.safetensors", "strength": 1.0},
]
)
# Create generator with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=4,
height=480, # Can be set to 720 for higher resolution
width=832, # Can be set to 1280 for higher resolution
num_frames=81,
guidance_scale=1,
sample_shift=5.0,
)
seed = 42
prompt = "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
"""
Wan2.1 text-to-video generation example.
This example demonstrates how to use LightX2V with Wan2.1 model for T2V generation.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for Wan2.1 T2V task
pipe = LightX2VPipeline(
model_path="/path/to/Wan2.1-T2V-14B",
model_cls="wan2.1",
task="t2v",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(config_json="../configs/wan/wan_t2v.json")
# Create generator with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=50,
height=480, # Can be set to 720 for higher resolution
width=832, # Can be set to 1280 for higher resolution
num_frames=81,
guidance_scale=5.0,
sample_shift=5.0,
)
seed = 42
prompt = "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
"""
Wan2.1 VACE (Video Animate Character Exchange) generation example.
This example demonstrates how to use LightX2V with Wan2.1 VACE model for character exchange in videos.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for VACE task
pipe = LightX2VPipeline(
model_path="/path/to/Wan2.1-VACE-1.3B",
src_ref_images="../assets/inputs/imgs/girl.png,../assets/inputs/imgs/snake.png",
model_cls="wan2.1_vace",
task="vace",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(
# config_json="../configs/wan/wan_vace.json"
# )
# Optional: enable offloading to significantly reduce VRAM usage
# Suitable for RTX 30/40/50 consumer GPUs
# pipe.enable_offload(
# cpu_offload=True,
# offload_granularity="block",
# text_encoder_offload=True,
# image_encoder_offload=False,
# vae_offload=False,
# )
# Create generator with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=40,
height=480, # Can be set to 720 for higher resolution
width=832, # Can be set to 1280 for higher resolution
num_frames=81,
guidance_scale=5,
sample_shift=16,
)
seed = 42
prompt = "在一个欢乐而充满节日气氛的场景中,穿着鲜艳红色春服的小女孩正与她的可爱卡通蛇嬉戏。她的春服上绣着金色吉祥图案,散发着喜庆的气息,脸上洋溢着灿烂的笑容。蛇身呈现出亮眼的绿色,形状圆润,宽大的眼睛让它显得既友善又幽默。小女孩欢快地用手轻轻抚摸着蛇的头部,共同享受着这温馨的时刻。周围五彩斑斓的灯笼和彩带装饰着环境,阳光透过洒在她们身上,营造出一个充满友爱与幸福的新年氛围。"
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
seed=seed,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)
Metadata-Version: 2.4
Name: lightx2v
Version: 0.1.0
Summary: LightX2V: Light Video Generation Inference Framework
Author: LightX2V Contributors
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/ModelTC/LightX2V
Project-URL: Documentation, https://lightx2v-en.readthedocs.io/en/latest/
Project-URL: Repository, https://github.com/ModelTC/LightX2V
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Multimedia :: Video
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: torch<=2.8.0
Requires-Dist: torchvision<=0.23.0
Requires-Dist: torchaudio<=2.8.0
Requires-Dist: diffusers
Requires-Dist: transformers
Requires-Dist: tokenizers
Requires-Dist: tqdm
Requires-Dist: accelerate
Requires-Dist: safetensors
Requires-Dist: opencv-python
Requires-Dist: imageio
Requires-Dist: imageio-ffmpeg
Requires-Dist: einops
Requires-Dist: loguru
Requires-Dist: qtorch
Requires-Dist: ftfy
Requires-Dist: gradio
Requires-Dist: aiohttp
Requires-Dist: pydantic
Requires-Dist: prometheus-client
Requires-Dist: gguf
Requires-Dist: fastapi
Requires-Dist: uvicorn
Requires-Dist: PyJWT
Requires-Dist: requests
Requires-Dist: aio-pika
Requires-Dist: asyncpg>=0.27.0
Requires-Dist: aioboto3>=12.0.0
Requires-Dist: alibabacloud_dypnsapi20170525==1.2.2
Requires-Dist: redis==6.4.0
Requires-Dist: tos
Requires-Dist: decord
Requires-Dist: av
<div align="center" style="font-family: charter;">
<h1>⚡️ LightX2V:<br> Light Video Generation Inference Framework</h1>
<img alt="logo" src="assets/img_lightx2v.png" width=75%></img>
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/ModelTC/lightx2v)
[![Doc](https://img.shields.io/badge/docs-English-99cc2)](https://lightx2v-en.readthedocs.io/en/latest)
[![Doc](https://img.shields.io/badge/文档-中文-99cc2)](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest)
[![Papers](https://img.shields.io/badge/论文集-中文-99cc2)](https://lightx2v-papers-zhcn.readthedocs.io/zh-cn/latest)
[![Docker](https://img.shields.io/badge/Docker-2496ED?style=flat&logo=docker&logoColor=white)](https://hub.docker.com/r/lightx2v/lightx2v/tags)
**\[ English | [中文](README_zh.md) \]**
</div>
--------------------------------------------------------------------------------
**LightX2V** is an advanced lightweight video generation inference framework engineered to deliver efficient, high-performance video synthesis solutions. This unified platform integrates multiple state-of-the-art video generation techniques, supporting diverse generation tasks including text-to-video (T2V) and image-to-video (I2V). **X2V represents the transformation of different input modalities (X, such as text or images) into video output (V)**.
## :fire: Latest News
- **November 21, 2025:** 🚀 We support the [HunyuanVideo-1.5](https://huggingface.co/tencent/HunyuanVideo-1.5) video generation model since Day 0. With the same number of GPUs, LightX2V can achieve a speed improvement of over 2 times and supports deployment on GPUs with lower memory (such as the 24GB RTX 4090). It also supports CFG/Ulysses parallelism, efficient offloading, TeaCache/MagCache technologies, and more. We will soon update our models on our [HuggingFace page](https://huggingface.co/lightx2v), including quantization, step distillation, VAE distillation, and other related models. Refer to [this](https://github.com/ModelTC/LightX2V/tree/main/scripts/hunyuan_video_15) for usage tutorials.
## 💡 Quick Start
For comprehensive usage instructions, please refer to our documentation: **[English Docs](https://lightx2v-en.readthedocs.io/en/latest/) | [中文文档](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/)**
### Installation from Git
```bash
pip install -v git+https://github.com/ModelTC/LightX2V.git
```
### Building from Source
```bash
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v . # pip install -v .
```
### (Optional) Install Attention Operators
For attention operators installation, please refer to our documentation: **[English Docs](https://lightx2v-en.readthedocs.io/en/latest/getting_started/quickstart.html#step-4-install-attention-operators) | [中文文档](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/quickstart.html#id9)**
### Quick Start
```python
# examples/hunyuan_video/hunyuan_t2v.py
from lightx2v import LightX2VPipeline
pipe = LightX2VPipeline(
    model_path="/path/to/ckpts/hunyuanvideo-1.5/",
    model_cls="hunyuan_video_1.5",
    transformer_model_name="720p_t2v",
    task="t2v",
)
pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    num_frames=121,
    guidance_scale=6.0,
    sample_shift=9.0,
    aspect_ratio="16:9",
    fps=24,
)
seed = 123
prompt = "A close-up shot captures a scene on a polished, light-colored granite kitchen counter, illuminated by soft natural light from an unseen window."
negative_prompt = ""
save_result_path = "/path/to/save_results/output.mp4"
pipe.generate(
    seed=seed,
    prompt=prompt,
    negative_prompt=negative_prompt,
    save_result_path=save_result_path,
)
```
## 🤖 Supported Model Ecosystem
### Official Open-Source Models
- ✅ [HunyuanVideo-1.5](https://huggingface.co/tencent/HunyuanVideo-1.5)
- ✅ [Wan2.1 & Wan2.2](https://huggingface.co/Wan-AI/)
- ✅ [Qwen-Image](https://huggingface.co/Qwen/Qwen-Image)
- ✅ [Qwen-Image-Edit](https://huggingface.co/spaces/Qwen/Qwen-Image-Edit)
- ✅ [Qwen-Image-Edit-2509](https://huggingface.co/Qwen/Qwen-Image-Edit-2509)
### Quantized and Distilled Models/LoRAs (**🚀 Recommended: 4-step inference**)
- ✅ [Hy1.5-Quantized-Models](https://huggingface.co/lightx2v/Hy1.5-Quantized-Models)
- ✅ [Wan2.1-Distill-Models](https://huggingface.co/lightx2v/Wan2.1-Distill-Models)
- ✅ [Wan2.2-Distill-Models](https://huggingface.co/lightx2v/Wan2.2-Distill-Models)
- ✅ [Wan2.1-Distill-Loras](https://huggingface.co/lightx2v/Wan2.1-Distill-Loras)
- ✅ [Wan2.2-Distill-Loras](https://huggingface.co/lightx2v/Wan2.2-Distill-Loras)
### Lightweight Autoencoder Models (**🚀 Recommended: fast inference & low memory usage**)
- ✅ [Autoencoders](https://huggingface.co/lightx2v/Autoencoders)
### Autoregressive Models
- ✅ [Wan2.1-T2V-CausVid](https://huggingface.co/lightx2v/Wan2.1-T2V-14B-CausVid)
- ✅ [Self-Forcing](https://github.com/guandeh17/Self-Forcing)
- ✅ [Matrix-Game-2.0](https://huggingface.co/Skywork/Matrix-Game-2.0)
🔔 Follow our [HuggingFace page](https://huggingface.co/lightx2v) for the latest model releases from our team.
💡 Refer to the [Model Structure Documentation](https://lightx2v-en.readthedocs.io/en/latest/getting_started/model_structure.html) to quickly get started with LightX2V
## 🚀 Frontend Interfaces
We provide multiple frontend interface deployment options:
- **🎨 Gradio Interface**: Clean and user-friendly web interface, perfect for quick experience and prototyping
- 📖 [Gradio Deployment Guide](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/deploy_gradio.html)
- **🎯 ComfyUI Interface**: Powerful node-based workflow interface, supporting complex video generation tasks
- 📖 [ComfyUI Deployment Guide](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/deploy_comfyui.html)
- **🚀 Windows One-Click Deployment**: Convenient deployment solution designed for Windows users, featuring automatic environment configuration and intelligent parameter optimization
- 📖 [Windows One-Click Deployment Guide](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/deploy_local_windows.html)
**💡 Recommended Solutions**:
- **First-time Users**: We recommend the Windows one-click deployment solution
- **Advanced Users**: We recommend the ComfyUI interface for more customization options
- **Quick Experience**: The Gradio interface provides the most intuitive operation experience
## 🚀 Core Features
### 🎯 **Ultimate Performance Optimization**
- **🔥 SOTA Inference Speed**: Achieve **~20x** acceleration via step distillation and system optimization (single GPU)
- **⚡️ Revolutionary 4-Step Distillation**: Compress the original 40-50 step inference to just 4 steps, with no CFG required
- **🛠️ Advanced Operator Support**: Integrated with cutting-edge operators including [Sage Attention](https://github.com/thu-ml/SageAttention), [Flash Attention](https://github.com/Dao-AILab/flash-attention), [Radial Attention](https://github.com/mit-han-lab/radial-attention), [q8-kernel](https://github.com/KONAKONA666/q8_kernels), [sgl-kernel](https://github.com/sgl-project/sglang/tree/main/sgl-kernel), [vllm](https://github.com/vllm-project/vllm)
### 💾 **Resource-Efficient Deployment**
- **💡 Breaking Hardware Barriers**: Run 14B models for 480P/720P video generation with only **8GB VRAM + 16GB RAM**
- **🔧 Intelligent Parameter Offloading**: Advanced disk-CPU-GPU three-tier offloading architecture with phase/block-level granular management
- **⚙️ Comprehensive Quantization**: Support for `w8a8-int8`, `w8a8-fp8`, `w4a4-nvfp4` and other quantization strategies
### 🎨 **Rich Feature Ecosystem**
- **📈 Smart Feature Caching**: Intelligent caching mechanisms to eliminate redundant computations
- **🔄 Parallel Inference**: Multi-GPU parallel processing for enhanced performance
- **📱 Flexible Deployment Options**: Support for Gradio, service deployment, ComfyUI and other deployment methods
- **🎛️ Dynamic Resolution Inference**: Adaptive resolution adjustment for optimal generation quality
- **🎞️ Video Frame Interpolation**: RIFE-based frame interpolation for smooth frame rate enhancement
## 🏆 Performance Benchmarks
For detailed performance metrics and comparisons, please refer to our [benchmark documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/benchmark_source.md).
[Detailed Service Deployment Guide →](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/deploy_service.html)
## 📚 Technical Documentation
### 📖 **Method Tutorials**
- [Model Quantization](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/quantization.html) - Comprehensive guide to quantization strategies
- [Feature Caching](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/cache.html) - Intelligent caching mechanisms
- [Attention Mechanisms](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/attention.html) - State-of-the-art attention operators
- [Parameter Offloading](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/offload.html) - Three-tier storage architecture
- [Parallel Inference](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/parallel.html) - Multi-GPU acceleration strategies
- [Changing Resolution Inference](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/changing_resolution.html) - U-shaped resolution strategy
- [Step Distillation](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/step_distill.html) - 4-step inference technology
- [Video Frame Interpolation](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/video_frame_interpolation.html) - Based on RIFE technology
### 🛠️ **Deployment Guides**
- [Low-Resource Deployment](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/for_low_resource.html) - Optimized 8GB VRAM solutions
- [Low-Latency Deployment](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/for_low_latency.html) - Ultra-fast inference optimization
- [Gradio Deployment](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/deploy_gradio.html) - Web interface setup
- [Service Deployment](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/deploy_service.html) - Production API service deployment
- [LoRA Model Deployment](https://lightx2v-en.readthedocs.io/en/latest/deploy_guides/lora_deploy.html) - Flexible LoRA deployment
## 🧾 Contributing Guidelines
We maintain code quality through automated pre-commit hooks to ensure consistent formatting across the project.
> [!TIP]
> **Setup Instructions:**
>
> 1. Install required dependencies:
> ```shell
> pip install ruff pre-commit
> ```
>
> 2. Run before committing:
> ```shell
> pre-commit run --all-files
> ```
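>
> 3. (Optional) You can also install the git hook once so these checks run automatically on every `git commit`:
> ```shell
> pre-commit install
> ```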
We appreciate your contributions to making LightX2V better!
## 🤝 Acknowledgments
We extend our gratitude to all the model repositories and research communities that inspired and contributed to the development of LightX2V. This framework builds upon the collective efforts of the open-source community.
## 🌟 Star History
[![Star History Chart](https://api.star-history.com/svg?repos=ModelTC/lightx2v&type=Timeline)](https://star-history.com/#ModelTC/lightx2v&Timeline)
## ✏️ Citation
If you find LightX2V useful in your research, please consider citing our work:
```bibtex
@misc{lightx2v,
    author = {LightX2V Contributors},
    title = {LightX2V: Light Video Generation Inference Framework},
    year = {2025},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/ModelTC/lightx2v}},
}
```
## 📞 Contact & Support
For questions, suggestions, or support, please feel free to reach out through:
- 🐛 [GitHub Issues](https://github.com/ModelTC/lightx2v/issues) - Bug reports and feature requests
- 💬 [GitHub Discussions](https://github.com/ModelTC/lightx2v/discussions) - Community discussions and Q&A
---
<div align="center">
Built with ❤️ by the LightX2V team
</div>
README.md
pyproject.toml
lightx2v/__init__.py
lightx2v/infer.py
lightx2v/pipeline.py
lightx2v.egg-info/PKG-INFO
lightx2v.egg-info/SOURCES.txt
lightx2v.egg-info/dependency_links.txt
lightx2v.egg-info/requires.txt
lightx2v.egg-info/top_level.txt
lightx2v/common/__init__.py
lightx2v/common/modules/__init__.py
lightx2v/common/modules/weight_module.py
lightx2v/common/offload/manager.py
lightx2v/common/ops/__init__.py
lightx2v/common/ops/attn/__init__.py
lightx2v/common/ops/attn/flash_attn.py
lightx2v/common/ops/attn/nbhd_attn.py
lightx2v/common/ops/attn/radial_attn.py
lightx2v/common/ops/attn/ring_attn.py
lightx2v/common/ops/attn/sage_attn.py
lightx2v/common/ops/attn/spassage_attn.py
lightx2v/common/ops/attn/svg2_attn.py
lightx2v/common/ops/attn/svg2_attn_utils.py
lightx2v/common/ops/attn/svg_attn.py
lightx2v/common/ops/attn/template.py
lightx2v/common/ops/attn/torch_sdpa.py
lightx2v/common/ops/attn/ulysses_attn.py
lightx2v/common/ops/attn/utils/all2all.py
lightx2v/common/ops/attn/utils/ring_comm.py
lightx2v/common/ops/conv/__init__.py
lightx2v/common/ops/conv/conv2d.py
lightx2v/common/ops/conv/conv3d.py
lightx2v/common/ops/embedding/__init__.py
lightx2v/common/ops/embedding/embedding_weight.py
lightx2v/common/ops/mm/__init__.py
lightx2v/common/ops/mm/mm_weight.py
lightx2v/common/ops/norm/__init__.py
lightx2v/common/ops/norm/layer_norm_weight.py
lightx2v/common/ops/norm/rms_norm_weight.py
lightx2v/common/ops/norm/triton_ops.py
lightx2v/common/ops/tensor/__init__.py
lightx2v/common/ops/tensor/tensor.py
lightx2v/common/transformer_infer/transformer_infer.py
lightx2v/deploy/__init__.py
lightx2v/deploy/common/__init__.py
lightx2v/deploy/common/aliyun.py
lightx2v/deploy/common/pipeline.py
lightx2v/deploy/common/utils.py
lightx2v/deploy/common/va_reader.py
lightx2v/deploy/common/va_recorder.py
lightx2v/deploy/common/va_recorder_x264.py
lightx2v/deploy/common/volcengine_tts.py
lightx2v/deploy/data_manager/__init__.py
lightx2v/deploy/data_manager/local_data_manager.py
lightx2v/deploy/data_manager/s3_data_manager.py
lightx2v/deploy/queue_manager/__init__.py
lightx2v/deploy/queue_manager/local_queue_manager.py
lightx2v/deploy/queue_manager/rabbitmq_queue_manager.py
lightx2v/deploy/server/__init__.py
lightx2v/deploy/server/__main__.py
lightx2v/deploy/server/auth.py
lightx2v/deploy/server/metrics.py
lightx2v/deploy/server/monitor.py
lightx2v/deploy/server/redis_client.py
lightx2v/deploy/server/redis_monitor.py
lightx2v/deploy/task_manager/__init__.py
lightx2v/deploy/task_manager/local_task_manager.py
lightx2v/deploy/task_manager/sql_task_manager.py
lightx2v/deploy/worker/__init__.py
lightx2v/deploy/worker/__main__.py
lightx2v/deploy/worker/hub.py
lightx2v/models/__init__.py
lightx2v/models/input_encoders/__init__.py
lightx2v/models/input_encoders/hf/__init__.py
lightx2v/models/input_encoders/hf/q_linear.py
lightx2v/models/input_encoders/hf/animate/__init__.py
lightx2v/models/input_encoders/hf/animate/face_encoder.py
lightx2v/models/input_encoders/hf/animate/motion_encoder.py
lightx2v/models/input_encoders/hf/hunyuan15/byt5/__init__.py
lightx2v/models/input_encoders/hf/hunyuan15/byt5/format_prompt.py
lightx2v/models/input_encoders/hf/hunyuan15/byt5/model.py
lightx2v/models/input_encoders/hf/hunyuan15/qwen25/__init__.py
lightx2v/models/input_encoders/hf/hunyuan15/qwen25/model.py
lightx2v/models/input_encoders/hf/hunyuan15/siglip/__init__.py
lightx2v/models/input_encoders/hf/hunyuan15/siglip/model.py
lightx2v/models/input_encoders/hf/qwen25/qwen25_vlforconditionalgeneration.py
lightx2v/models/input_encoders/hf/seko_audio/audio_adapter.py
lightx2v/models/input_encoders/hf/seko_audio/audio_encoder.py
lightx2v/models/input_encoders/hf/vace/vace_processor.py
lightx2v/models/input_encoders/hf/wan/matrix_game2/__init__.py
lightx2v/models/input_encoders/hf/wan/matrix_game2/clip.py
lightx2v/models/input_encoders/hf/wan/matrix_game2/conditions.py
lightx2v/models/input_encoders/hf/wan/matrix_game2/tokenizers.py
lightx2v/models/input_encoders/hf/wan/t5/__init__.py
lightx2v/models/input_encoders/hf/wan/t5/model.py
lightx2v/models/input_encoders/hf/wan/t5/tokenizer.py
lightx2v/models/input_encoders/hf/wan/xlm_roberta/__init__.py
lightx2v/models/input_encoders/hf/wan/xlm_roberta/model.py
lightx2v/models/networks/__init__.py
lightx2v/models/networks/hunyuan_video/__init__.py
lightx2v/models/networks/hunyuan_video/model.py
lightx2v/models/networks/hunyuan_video/infer/attn_no_pad.py
lightx2v/models/networks/hunyuan_video/infer/module_io.py
lightx2v/models/networks/hunyuan_video/infer/post_infer.py
lightx2v/models/networks/hunyuan_video/infer/pre_infer.py
lightx2v/models/networks/hunyuan_video/infer/transformer_infer.py
lightx2v/models/networks/hunyuan_video/infer/triton_ops.py
lightx2v/models/networks/hunyuan_video/infer/feature_caching/__init__.py
lightx2v/models/networks/hunyuan_video/infer/feature_caching/transformer_infer.py
lightx2v/models/networks/hunyuan_video/infer/offload/__init__.py
lightx2v/models/networks/hunyuan_video/infer/offload/transformer_infer.py
lightx2v/models/networks/hunyuan_video/weights/post_weights.py
lightx2v/models/networks/hunyuan_video/weights/pre_weights.py
lightx2v/models/networks/hunyuan_video/weights/transformer_weights.py
lightx2v/models/networks/qwen_image/model.py
lightx2v/models/networks/qwen_image/infer/post_infer.py
lightx2v/models/networks/qwen_image/infer/pre_infer.py
lightx2v/models/networks/qwen_image/infer/transformer_infer.py
lightx2v/models/networks/qwen_image/infer/offload/__init__.py
lightx2v/models/networks/qwen_image/infer/offload/transformer_infer.py
lightx2v/models/networks/qwen_image/weights/post_weights.py
lightx2v/models/networks/qwen_image/weights/pre_weights.py
lightx2v/models/networks/qwen_image/weights/transformer_weights.py
lightx2v/models/networks/wan/animate_model.py
lightx2v/models/networks/wan/audio_model.py
lightx2v/models/networks/wan/causvid_model.py
lightx2v/models/networks/wan/distill_model.py
lightx2v/models/networks/wan/lora_adapter.py
lightx2v/models/networks/wan/matrix_game2_model.py
lightx2v/models/networks/wan/model.py
lightx2v/models/networks/wan/sf_model.py
lightx2v/models/networks/wan/vace_model.py
lightx2v/models/networks/wan/infer/module_io.py
lightx2v/models/networks/wan/infer/post_infer.py
lightx2v/models/networks/wan/infer/pre_infer.py
lightx2v/models/networks/wan/infer/transformer_infer.py
lightx2v/models/networks/wan/infer/utils.py
lightx2v/models/networks/wan/infer/animate/pre_infer.py
lightx2v/models/networks/wan/infer/animate/transformer_infer.py
lightx2v/models/networks/wan/infer/audio/post_infer.py
lightx2v/models/networks/wan/infer/audio/pre_infer.py
lightx2v/models/networks/wan/infer/audio/transformer_infer.py
lightx2v/models/networks/wan/infer/causvid/__init__.py
lightx2v/models/networks/wan/infer/causvid/transformer_infer.py
lightx2v/models/networks/wan/infer/feature_caching/__init__.py
lightx2v/models/networks/wan/infer/feature_caching/transformer_infer.py
lightx2v/models/networks/wan/infer/matrix_game2/posemb_layers.py
lightx2v/models/networks/wan/infer/matrix_game2/pre_infer.py
lightx2v/models/networks/wan/infer/matrix_game2/transformer_infer.py
lightx2v/models/networks/wan/infer/offload/__init__.py
lightx2v/models/networks/wan/infer/offload/transformer_infer.py
lightx2v/models/networks/wan/infer/self_forcing/__init__.py
lightx2v/models/networks/wan/infer/self_forcing/pre_infer.py
lightx2v/models/networks/wan/infer/self_forcing/transformer_infer.py
lightx2v/models/networks/wan/infer/vace/transformer_infer.py
lightx2v/models/networks/wan/weights/post_weights.py
lightx2v/models/networks/wan/weights/pre_weights.py
lightx2v/models/networks/wan/weights/transformer_weights.py
lightx2v/models/networks/wan/weights/animate/transformer_weights.py
lightx2v/models/networks/wan/weights/audio/transformer_weights.py
lightx2v/models/networks/wan/weights/matrix_game2/pre_weights.py
lightx2v/models/networks/wan/weights/matrix_game2/transformer_weights.py
lightx2v/models/networks/wan/weights/vace/transformer_weights.py
lightx2v/models/runners/__init__.py
lightx2v/models/runners/base_runner.py
lightx2v/models/runners/default_runner.py
lightx2v/models/runners/hunyuan_video/hunyuan_video_15_runner.py
lightx2v/models/runners/qwen_image/qwen_image_runner.py
lightx2v/models/runners/vsr/vsr_wrapper.py
lightx2v/models/runners/vsr/vsr_wrapper_hy15.py
lightx2v/models/runners/vsr/utils/TCDecoder.py
lightx2v/models/runners/vsr/utils/utils.py
lightx2v/models/runners/wan/__init__.py
lightx2v/models/runners/wan/wan_animate_runner.py
lightx2v/models/runners/wan/wan_audio_runner.py
lightx2v/models/runners/wan/wan_distill_runner.py
lightx2v/models/runners/wan/wan_matrix_game2_runner.py
lightx2v/models/runners/wan/wan_runner.py
lightx2v/models/runners/wan/wan_sf_runner.py
lightx2v/models/runners/wan/wan_vace_runner.py
lightx2v/models/schedulers/__init__.py
lightx2v/models/schedulers/scheduler.py
lightx2v/models/schedulers/hunyuan_video/__init__.py
lightx2v/models/schedulers/hunyuan_video/posemb_layers.py
lightx2v/models/schedulers/hunyuan_video/scheduler.py
lightx2v/models/schedulers/hunyuan_video/feature_caching/__init__.py
lightx2v/models/schedulers/hunyuan_video/feature_caching/scheduler.py
lightx2v/models/schedulers/qwen_image/scheduler.py
lightx2v/models/schedulers/wan/scheduler.py
lightx2v/models/schedulers/wan/audio/scheduler.py
lightx2v/models/schedulers/wan/changing_resolution/scheduler.py
lightx2v/models/schedulers/wan/feature_caching/scheduler.py
lightx2v/models/schedulers/wan/self_forcing/scheduler.py
lightx2v/models/schedulers/wan/step_distill/scheduler.py
lightx2v/models/vfi/rife/rife_comfyui_wrapper.py
lightx2v/models/vfi/rife/model/loss.py
lightx2v/models/vfi/rife/model/warplayer.py
lightx2v/models/vfi/rife/model/pytorch_msssim/__init__.py
lightx2v/models/vfi/rife/train_log/IFNet_HDv3.py
lightx2v/models/vfi/rife/train_log/RIFE_HDv3.py
lightx2v/models/vfi/rife/train_log/refine.py
lightx2v/models/video_encoders/__init__.py
lightx2v/models/video_encoders/hf/__init__.py
lightx2v/models/video_encoders/hf/tae.py
lightx2v/models/video_encoders/hf/vid_recon.py
lightx2v/models/video_encoders/hf/hunyuanvideo15/__init__.py
lightx2v/models/video_encoders/hf/hunyuanvideo15/hunyuanvideo_15_vae.py
lightx2v/models/video_encoders/hf/hunyuanvideo15/lighttae_hy15.py
lightx2v/models/video_encoders/hf/qwen_image/__init__.py
lightx2v/models/video_encoders/hf/qwen_image/vae.py
lightx2v/models/video_encoders/hf/wan/__init__.py
lightx2v/models/video_encoders/hf/wan/vae.py
lightx2v/models/video_encoders/hf/wan/vae_2_2.py
lightx2v/models/video_encoders/hf/wan/vae_sf.py
lightx2v/models/video_encoders/hf/wan/vae_tiny.py
lightx2v/server/__init__.py
lightx2v/server/__main__.py
lightx2v/server/api.py
lightx2v/server/audio_utils.py
lightx2v/server/config.py
lightx2v/server/distributed_utils.py
lightx2v/server/image_utils.py
lightx2v/server/main.py
lightx2v/server/run_server.py
lightx2v/server/schema.py
lightx2v/server/service.py
lightx2v/server/task_manager.py
lightx2v/server/metrics/__init__.py
lightx2v/server/metrics/metrics.py
lightx2v/server/metrics/monitor.py
lightx2v/utils/__init__.py
lightx2v/utils/async_io.py
lightx2v/utils/custom_compiler.py
lightx2v/utils/envs.py
lightx2v/utils/generate_task_id.py
lightx2v/utils/global_paras.py
lightx2v/utils/input_info.py
lightx2v/utils/lockable_dict.py
lightx2v/utils/memory_profiler.py
lightx2v/utils/print_atten_score.py
lightx2v/utils/profiler.py
lightx2v/utils/prompt_enhancer.py
lightx2v/utils/quant_utils.py
lightx2v/utils/registry_factory.py
lightx2v/utils/service_utils.py
lightx2v/utils/set_config.py
lightx2v/utils/utils.py
numpy
scipy
torch<=2.8.0
torchvision<=0.23.0
torchaudio<=2.8.0
diffusers
transformers
tokenizers
tqdm
accelerate
safetensors
opencv-python
imageio
imageio-ffmpeg
einops
loguru
qtorch
ftfy
gradio
aiohttp
pydantic
prometheus-client
gguf
fastapi
uvicorn
PyJWT
requests
aio-pika
asyncpg>=0.27.0
aioboto3>=12.0.0
alibabacloud_dypnsapi20170525==1.2.2
redis==6.4.0
tos
decord
av