help="Model class to use (wan2.1: standard model, wan2.1_distill: distilled model for faster inference)",
)
)
parser.add_argument("--model_size",type=str,required=True,choices=["14b","1.3b"],help="Model type to use")
parser.add_argument("--model_size",type=str,required=True,choices=["14b","1.3b"],help="Model type to use")
parser.add_argument("--task",type=str,required=True,choices=["i2v","t2v"],help="Specify the task type. 'i2v' for image-to-video translation, 't2v' for text-to-video generation.")
parser.add_argument("--task",type=str,required=True,choices=["i2v","t2v"],help="Specify the task type. 'i2v' for image-to-video translation, 't2v' for text-to-video generation.")
Lightx2v is a lightweight video inference and generation engine that provides a web interface based on Gradio, supporting both Image-to-Video and Text-to-Video generation modes.
## 📁 File Structure
```
LightX2V/app/
├── gradio_demo.py # English interface demo
├── gradio_demo_zh.py # Chinese interface demo
├── run_gradio.sh # Startup script
├── README.md # Documentation
├── saved_videos/ # Generated video save directory
└── inference_logs.log # Inference logs
```
This project contains two main demo files:

- `gradio_demo.py` - English interface version
- `gradio_demo_zh.py` - Chinese interface version
## 🚀 Quick Start

### Environment Requirements

Follow the [Quick Start Guide](../getting_started/quickstart.md) to set up the environment.

- Python 3.10+ (recommended)
- CUDA 12.4+ (recommended)
- At least 8GB GPU VRAM
- At least 16GB system memory (32GB or more preferred)
- At least 128GB of SSD storage (**💾 Storing model files on an SSD is strongly recommended: with "lazy loading" enabled at startup, it significantly improves model loading speed and inference performance**)
**🎯 Model Category Description**:

- **`wan2.1`**: Standard model; provides the best video generation quality, suitable for scenarios with extremely high quality requirements
- **`wan2.1_distill`**: Distilled model; optimized through knowledge distillation, it significantly improves inference speed and greatly reduces computation time while maintaining good quality, suitable for most application scenarios

**Usage recommendations**:

- **Resource-constrained**: Prefer the distilled model and lower resolutions
- **Real-time applications**: The distilled model (`wan2.1_distill`) is strongly recommended
### Startup Methods

#### Method 1: Using the Startup Script (Recommended)

**Linux Environment:**

```bash
# 1. Edit the startup script to configure the relevant paths
cd app/
vim run_gradio.sh

# 2. Run the startup script
bash run_gradio.sh

# 3. Or start with parameters (distilled models recommended)
bash run_gradio.sh --task i2v --lang en --model_size 14b --port 8032
```
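For reference, the demo script itself exposes the arguments shown in the parameter block at the top of this page. The sketch below assumes the model-class flag is `--model_cls` and that model paths are configured inside `run_gradio.sh` or passed via additional flags; check the script for the full set:

```bash
# Hypothetical direct launch (sketch): flags beyond those shown above,
# such as the model path, may also be required -- see run_gradio.sh.
cd app/
python gradio_demo.py --model_cls wan2.1_distill --model_size 14b --task i2v
```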
## 📖 Overview

This document provides detailed instructions for deploying LightX2V locally on Windows environments, including batch file inference, Gradio Web interface inference, and other usage methods.

## System Requirements

Before getting started, please ensure your system meets the following requirements:

### Hardware Requirements

- **Graphics Card**: NVIDIA GPU (with CUDA support), 8GB+ VRAM recommended
- **Memory**: At least 16GB RAM
- **Storage**: 20GB+ available disk space; an SSD is strongly recommended, as mechanical hard drives make model loading slow

### Software Requirements

- **Operating System**: Windows 10/11
- **Python**: 3.12 or higher
- **CUDA**: 12.4 or higher
- **Environment Manager**: Anaconda or Miniconda installed
- **Dependencies**: See the LightX2V project's requirements_win.txt
### Step 1: Check GPU Driver and CUDA Version

First, verify your GPU driver and CUDA version by running the following command in Command Prompt:
```bash
nvidia-smi
```
Note the **CUDA Version** displayed in the output, as you'll need to match this version during subsequent installations.
### Step 2: Create Python Environment

Create an isolated conda environment; Python 3.12 is recommended:

```bash
# Create new environment (using Python 3.12 as an example)
conda create -n lightx2v python=3.12 -y

# Activate environment
conda activate lightx2v
```
> 💡 **Tip**: Python 3.10 or higher is recommended for optimal compatibility.
### Step 3: Install PyTorch Framework

#### Method 1: Download Official Wheel Packages (Recommended)

1. Visit the [PyTorch Official Wheel Download Page](https://download.pytorch.org/whl/torch/)
2. Select the appropriate wheel package, making sure to match:
   - **Python Version**: must match your environment (cp312 means Python 3.12)
   - **CUDA Version**: must match your GPU driver
   - **Platform**: choose the Windows build (win_amd64)

**Example for Python 3.12 + PyTorch 2.6 + CUDA 12.4:**
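The exact wheel filename depends on the build you download; the name below follows the official naming pattern and is shown as an assumed example:

```bash
# Install the downloaded wheel (example filename -- use the file you actually downloaded)
pip install torch-2.6.0+cu124-cp312-cp312-win_amd64.whl

# Quick check that PyTorch sees the GPU
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```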
After PyTorch, install an attention backend; two options are available:

#### Option A: Flash Attention

```bash
pip install flash-attn==2.7.2.post1
```

#### Option B: SageAttention 2 (Recommended)

**Download Sources:**

- [Windows Version 1](https://github.com/woct0rdho/SageAttention/releases)
- [Windows Version 2](https://github.com/sdbds/SageAttention-for-windows/releases)

**Version Selection Guidelines:**

- Python version must match
- PyTorch version must match
- **CUDA version can be flexible** (SageAttention doesn't use breaking APIs yet)

**Recommended Installation Version:**
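The release you download determines the exact wheel name; as a sketch, installing it is a single pip command (the filename below is an assumed example -- pick the wheel matching your Python and PyTorch versions):

```bash
# Example only: substitute the SageAttention wheel you actually downloaded
pip install sageattention-2.1.1+cu126torch2.6.0-cp312-cp312-win_amd64.whl
```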
After installation, we recommend running a verification script to ensure everything works properly:
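A minimal sanity check is sketched below, assuming the wheel installs a `sageattention` package exposing `sageattn`; it requires a CUDA-capable GPU:

```python
# Minimal SageAttention sanity check (sketch).
import torch
from sageattention import sageattn

# Random half-precision tensors in (batch, heads, seq_len, head_dim) layout.
q = torch.randn(1, 8, 128, 64, dtype=torch.float16, device="cuda")
k = torch.randn(1, 8, 128, 64, dtype=torch.float16, device="cuda")
v = torch.randn(1, 8, 128, 64, dtype=torch.float16, device="cuda")

out = sageattn(q, k, v, is_causal=False)
print("SageAttention output shape:", out.shape)
```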
> 📝 **Testing**: You can also run the [official test script](https://github.com/woct0rdho/SageAttention/blob/main/tests/test_sageattn.py) for more detailed functionality verification.
### Step 6: Get LightX2V Project Code

Clone the LightX2V project from GitHub and install the Windows-specific dependencies:

```bash
# Clone project code
git clone https://github.com/ModelTC/LightX2V.git

# Enter project directory
cd LightX2V

# Install Windows-specific dependencies
pip install -r requirements_win.txt
```

> 🔍 **Note**: We use `requirements_win.txt` instead of the standard `requirements.txt` because Windows environments may require specific package versions or additional dependencies.

## 🎯 Usage Methods

### Method 1: Using Batch File Inference

Refer to the [Quick Start Guide](../getting_started/quickstart.md) to set up the environment, then use the [batch files](https://github.com/ModelTC/LightX2V/tree/main/scripts/win) to run inference.

### Method 2: Using Gradio Web Interface Inference

#### Manual Gradio Configuration

Refer to the [Quick Start Guide](../getting_started/quickstart.md) to set up the environment, then follow the [Gradio Deployment Guide](./deploy_gradio.md).

```
# Model class (wan2.1: standard model, wan2.1_distill: distilled model)
model_cls=wan2.1
```

**⚠️ Important Note**: If you are using a distilled model (model names containing the StepDistill-CfgDistil field), set `model_cls` to `wan2.1_distill`.

**🚀 Start Service**

Double-click the `start_lightx2v.bat` file. The script will:

1. Automatically read the configuration file
2. Verify model paths and file integrity
3. Start the Gradio Web interface
4. Automatically open the browser to access the service

**💡 Usage Suggestion**: After the Gradio Web page opens, it is recommended to check "Auto-configure Inference Options"; the system will automatically select appropriate optimization settings for your machine. If you change the resolution, re-check "Auto-configure Inference Options".

**⚠️ Important Note**: On first run, the system automatically extracts the environment file `env.zip`, which may take several minutes; please be patient. Subsequent launches skip this step. You can also extract `env.zip` into the current directory manually to save time on the first startup.

### Method 3: Using ComfyUI Inference

This method uses the portable Lightx2v-ComfyUI environment, so you can skip the manual environment configuration steps. It is suitable for users who want to quickly start experiencing accelerated video generation with Lightx2v on Windows systems.

#### Download the Windows Portable Environment

The portable environment packages all Python runtime dependencies, including the code and dependencies for both ComfyUI and LightX2V. After downloading, simply extract and use.

After extraction, the directory structure is as follows:

```
└── run_nvidia_gpu.bat   # Windows startup script (double-click to start)
```

#### Start ComfyUI

Directly double-click the `run_nvidia_gpu.bat` file. The system will open a Command Prompt window and run the program. The first startup may take a while; please be patient. After startup completes, the browser opens automatically and displays the ComfyUI frontend interface.

The plugin used by LightX2V-ComfyUI is [ComfyUI-Lightx2vWrapper](https://github.com/ModelTC/ComfyUI-Lightx2vWrapper). Example workflows can be obtained from that project.

#### Tested Graphics Cards (offload mode)

- Tested model: `Wan2.1-I2V-14B-480P`

| GPU Model | Task Type | VRAM Capacity | Actual Max VRAM Usage | Actual Max RAM Usage |
|-----------|-----------|---------------|-----------------------|----------------------|
## Troubleshooting
### 1. CUDA Version Mismatch
**Symptoms**: CUDA-related errors occur
**Solutions**:
- Verify GPU driver supports required CUDA version
- Re-download matching wheel packages
- Use `nvidia-smi` to check maximum supported CUDA version
### 2. Dependency Conflicts
**Symptoms**: Package version conflicts or import errors
**Solutions**:
- Recreate the environment and install dependencies strictly according to the version requirements
- Use virtual environments to isolate dependencies for different projects
### 3. Wheel Package Download Issues
**Symptoms**: Slow download speeds or connection failures
**Solutions**:
- Use download tools or a browser for direct downloads
- Look for mirror sources in your region
- Check network connections and firewall settings
## Next Steps

After completing the environment setup, you can:
- 📚 Check the [Quick Start Guide](../getting_started/quickstart.md) (skip the environment installation steps)
- 🌐 Use the [Gradio Web Interface](./deploy_gradio.md) for visual operations (skip the environment installation steps)

## Version Compatibility Reference

| Component | Recommended Version |
|-----------|---------------------|
💡 **Pro Tip**: If you encounter other issues, we recommend first checking whether all component versions match properly, as most problems stem from version incompatibilities.
In low-latency scenarios we pursue maximum speed, setting aside concerns such as VRAM and RAM overhead. We provide two solutions:

## 💡 Solution 1: Inference with a Step-Distillation Model

For this solution, refer to the [Step Distillation Documentation](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/step_distill.html).

🧠 **Step Distillation** is a very direct way to accelerate inference for video generation models: distilling from 50 steps down to 4 steps reduces the time cost to 4/50 of the original. This solution can also be combined with the techniques listed under Solution 2 below.
## 💡 Solution 2: Inference with a Non-Step-Distillation Model

Step distillation requires relatively large training resources, and the distilled model may show a reduced video dynamic range.

For the original model without step distillation, we can use the following solutions, individually or in combination, for acceleration:

1. [Parallel Inference](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/parallel.html) for multi-GPU parallel acceleration.
2. [Feature Caching](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/cache.html) to reduce the actual number of inference steps.
3. [Efficient Attention Mechanisms](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/attention.html) to accelerate attention inference.
4. [Model Quantization](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/quantization.html) to accelerate Linear layer inference.
5. [Variable Resolution Inference](https://lightx2v-en.readthedocs.io/en/latest/method_tutorials/changing_resolution.html) to reduce the resolution of intermediate inference steps.
## ⚠️ Note
Some acceleration solutions currently cannot be used together, and we are working to resolve this issue.
If you have any questions, feel free to report bugs or request features in [🐛 GitHub Issues](https://github.com/ModelTC/lightx2v/issues).