Autoregressive distillation is an experimental direction in LightX2V. By training distilled models, it reduces inference from the original 40-50 steps to **8 steps**, accelerating inference while enabling infinite-length video generation through a KV cache.
> ⚠️ Warning: Autoregressive distillation currently yields mediocre results, and the acceleration gain has not met expectations; it is best viewed as a long-term research direction. At present, LightX2V supports autoregressive models only for T2V.
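To make the mechanism concrete, the following is a minimal sketch (not the LightX2V implementation) of block-wise autoregressive generation: each frame block is denoised in 8 steps by a distilled model while earlier blocks' attention keys/values sit in a cache, so the rollout can continue for arbitrarily many blocks. The names `KVCache` and `denoise_block` are hypothetical placeholders.

```python
# A minimal sketch, assuming hypothetical names (`KVCache`, `denoise_block`);
# this is NOT the LightX2V implementation, only an illustration of the idea.
import torch

class KVCache:
    """Holds attention keys/values of frame blocks generated so far."""
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

def denoise_block(latent, cache, num_steps=8):
    """Denoise one frame block in a few steps.

    A real distilled model would attend to `cache` at every step;
    here a dummy update stands in for the network call."""
    for _ in range(num_steps):       # 8 steps instead of the original 40-50
        latent = 0.5 * latent        # stand-in for one denoising update
    cache.append(latent, latent)     # stand-in for the block's keys/values
    return latent

def generate(num_blocks, block_shape=(4, 16, 16, 16)):
    """Roll out arbitrarily many blocks; the growing cache is what lets
    the video be extended, in principle, without bound."""
    cache, blocks = KVCache(), []
    for _ in range(num_blocks):
        blocks.append(denoise_block(torch.randn(block_shape), cache))
    return torch.cat(blocks, dim=0)

video = generate(num_blocks=3)       # e.g. three blocks of frames
```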
## 🔍 Technical Principle
Autoregressive distillation is implemented via [CausVid](https://github.com/tianweiy/CausVid). CausVid performs step distillation and CFG distillation on a 1.3B autoregressive model. LightX2V extends it with several enhancements (a rough sketch of the distillation idea follows the list below):
1. **Larger Models**: Supports autoregressive distillation training for 14B models;
2. **More Complete Data Processing Pipeline**: Generates a training dataset of 50,000 prompt-video pairs.
For detailed implementation, refer to [CausVid-Plus](https://github.com/GoatWu/CausVid-Plus).
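As a rough illustration of what step and CFG distillation buy here, the sketch below regresses an 8-step student rollout onto the output of a many-step, CFG-guided teacher, so the student needs neither the extra steps nor a separate unconditional pass at inference time. All names (`teacher_sample`, `distill_step`) and the toy update rule are assumptions; the actual CausVid objective is more involved, see the repositories linked above.

```python
# A hedged sketch, not the CausVid/LightX2V training code. The update rule
# and function names are illustrative placeholders.
import torch
import torch.nn.functional as F

def teacher_sample(teacher, noise, text_emb, null_emb, steps=50, cfg_scale=5.0):
    """Many-step teacher sampling with classifier-free guidance."""
    x = noise
    for t in torch.linspace(1.0, 1.0 / steps, steps):
        cond = teacher(x, t, text_emb)
        uncond = teacher(x, t, null_emb)
        eps = uncond + cfg_scale * (cond - uncond)   # CFG combination
        x = x - eps / steps                          # toy denoising update
    return x

def distill_step(student, teacher, noise, text_emb, null_emb, optimizer):
    """One training step: an 8-step student rollout (conditional only,
    no CFG) is regressed onto the guided multi-step teacher output."""
    with torch.no_grad():
        target = teacher_sample(teacher, noise, text_emb, null_emb)
    x = noise
    for t in torch.linspace(1.0, 1.0 / 8, 8):        # distilled 8-step rollout
        x = x - student(x, t, text_emb) / 8
    loss = F.mse_loss(x, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```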
## 🛠️ Configuration Files
### Configuration File
Configuration options are provided in the [configs/causvid/](https://github.com/ModelTC/lightx2v/tree/main/configs/causvid) directory.
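As a hedged example of working with these files (the directory is real, but the key name below is an assumption; check the JSON files in the repository for the exact schema), one could list the configs and peek at the distilled step count:

```python
# Hypothetical helper: list the JSON configs under configs/causvid/ and
# print the distilled step count if such a key exists. The key name
# ("infer_steps") is an assumption, not a documented schema.
import json
from pathlib import Path

config_dir = Path("configs/causvid")
for cfg_file in sorted(config_dir.glob("*.json")):
    with open(cfg_file) as f:
        cfg = json.load(f)
    steps = cfg.get("infer_steps", "n/a")    # falls back if the key is absent
    print(f"{cfg_file.name}: infer_steps={steps}")
```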