update docs

Commit 773a7149 authored Jul 11, 2025 by GoatWu
2 changed files
--- a/docs/EN/source/method_tutorials/autoregressive_distill.md
+++ b/docs/EN/source/method_tutorials/autoregressive_distill.md
-# 自回归蒸馏
+# Autoregressive Distillation

-xxx
+Autoregressive distillation is an exploratory technique in LightX2V. By training a distilled model, it reduces the number of inference steps from the original 40-50 to **8**, accelerating inference while enabling video generation of unbounded length through KV Cache.
+
+> ⚠️ Warning: Autoregressive distillation currently yields only mediocre quality, and its speedup has fallen short of expectations; it is best treated as a long-term research project. LightX2V currently supports autoregressive models for T2V only.
+
+## 🔍 Technical Principle
+
+Autoregressive distillation is built on [CausVid](https://github.com/tianweiy/CausVid). CausVid applies step distillation and CFG distillation to a 1.3B autoregressive model. LightX2V extends it in two ways:
+
+1. **Larger Models**: Supports autoregressive distillation training for 14B models.
+2. **More Complete Data Processing Pipeline**: Generates a training dataset of 50,000 prompt-video pairs.
+
+For detailed implementation, refer to [CausVid-Plus](https://github.com/GoatWu/CausVid-Plus).
+
+## 🛠️ Configuration Files
+
+### Configuration File
+
+Configuration files are provided in the [configs/causvid/](https://github.com/ModelTC/lightx2v/tree/main/configs/causvid) directory:
+
+| Configuration File | Model Address |
+|-------------------|---------------|
+| [wan_t2v_causvid.json](https://github.com/ModelTC/lightx2v/blob/main/configs/causvid/wan_t2v_causvid.json) | https://huggingface.co/lightx2v/Wan2.1-T2V-14B-CausVid |
+
+### Key Configuration Parameters
+
+```json
+{
+  "enable_cfg": false,          // Disable CFG for speed improvement
+  "num_fragments": 3,           // Number of video segments generated at once, 5s each
+  "num_frames": 21,             // Frames per video segment, modify with caution!
+  "num_frame_per_block": 3,     // Frames per autoregressive block, modify with caution!
+  "num_blocks": 7,              // Autoregressive blocks per video segment, modify with caution!
+  "frame_seq_length": 1560,     // Encoding length per frame, modify with caution!
+  "denoising_step_list": [      // Denoising timestep list
+    999, 934, 862, 756, 603, 410, 250, 140, 74
+  ]
+}
+```
+
+## 📜 Usage
+
+### Model Preparation
+
+Place the downloaded model (`causal_model.pt` or `causal_model.safetensors`) in the `causvid_models/` folder under the Wan model root directory:
+- For T2V: `Wan2.1-T2V-14B/causvid_models/`
+
+### Inference Script
+
+```bash
+bash scripts/wan/run_wan_t2v_causvid.sh
+```
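The key configuration parameters in the English tutorial above are related to one another, and the sketch below makes those relations explicit. It is not part of LightX2V: the helper name `summarize_causvid_config` and the derived field names are hypothetical, and reading `num_frames * frame_seq_length` as the encoded sequence length of one segment is an assumption based only on the parameter comments.

```python
import json

# Values copied from the tutorial's "Key Configuration Parameters" block.
# The inline // comments are dropped because standard JSON does not allow them.
CONFIG_TEXT = """
{
  "enable_cfg": false,
  "num_fragments": 3,
  "num_frames": 21,
  "num_frame_per_block": 3,
  "num_blocks": 7,
  "frame_seq_length": 1560,
  "denoising_step_list": [999, 934, 862, 756, 603, 410, 250, 140, 74]
}
"""


def summarize_causvid_config(cfg: dict) -> dict:
    """Hypothetical helper: derive quantities implied by the config values."""
    frames_from_blocks = cfg["num_frame_per_block"] * cfg["num_blocks"]
    if frames_from_blocks != cfg["num_frames"]:
        raise ValueError(
            f"num_frame_per_block * num_blocks = {frames_from_blocks}, "
            f"but num_frames = {cfg['num_frames']}"
        )
    return {
        # Assumption: encoded length scales linearly with the frame count.
        "seq_len_per_block": cfg["num_frame_per_block"] * cfg["frame_seq_length"],
        "seq_len_per_segment": cfg["num_frames"] * cfg["frame_seq_length"],
        "total_frames": cfg["num_frames"] * cfg["num_fragments"],
        "num_listed_timesteps": len(cfg["denoising_step_list"]),
    }


config = json.loads(CONFIG_TEXT)
summary = summarize_causvid_config(config)
print(summary)
```

The consistency check shows why the three frame-related values are marked "modify with caution": changing one without the others breaks the relation `num_frames = num_frame_per_block * num_blocks`. The sketch only reports how many timesteps the list contains; how that list maps onto the 8 inference steps mentioned in the introduction is not stated in the tutorial.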
--- a/docs/ZH_CN/source/method_tutorials/autoregressive_distill.md
+++ b/docs/ZH_CN/source/method_tutorials/autoregressive_distill.md
 # Autoregressive Distillation

-xxx
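+Autoregressive distillation is an exploratory technique in LightX2V. By training a distilled model, it reduces the number of inference steps from the original 40-50 to **8**, accelerating inference while also making it possible to generate videos of unbounded length through KV Cache.
+
+> ⚠️ Warning: Autoregressive distillation currently yields only mediocre quality, and its speedup has fallen short of expectations, but it can serve as a long-term research project. LightX2V currently supports autoregressive models for T2V only.
+
+## 🔍 Technical Principle
+
+Autoregressive distillation is built on [CausVid](https://github.com/tianweiy/CausVid). CausVid applies step distillation and CFG distillation to a 1.3B autoregressive model. LightX2V builds on it with a series of extensions:
+
+1. **Larger Models**: Supports autoregressive distillation training for 14B models.
+2. **More Complete Data Processing Pipeline**: Generates a training dataset of 50,000 prompt-video pairs.
+
+For implementation details, see [CausVid-Plus](https://github.com/GoatWu/CausVid-Plus).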
+
+## 🛠️ Configuration File Description
+
+### Configuration File
+
+Configuration files are provided in the [configs/causvid/](https://github.com/ModelTC/lightx2v/tree/main/configs/causvid) directory:
+
+| Configuration File | Model Address |
+|--------------------|---------------|
+| [wan_t2v_causvid.json](https://github.com/ModelTC/lightx2v/blob/main/configs/causvid/wan_t2v_causvid.json) | https://huggingface.co/lightx2v/Wan2.1-T2V-14B-CausVid |
+
+### Key Configuration Parameters
+
+```json
+{
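+  "enable_cfg": false,          // Disable CFG to improve speed
+  "num_fragments": 3,           // Number of video segments generated per run, 5s each
+  "num_frames": 21,             // Frames per video segment, modify with caution!
+  "num_frame_per_block": 3,     // Frames per autoregressive block, modify with caution!
+  "num_blocks": 7,              // Autoregressive blocks per video segment, modify with caution!
+  "frame_seq_length": 1560,     // Encoding length per frame, modify with caution!
+  "denoising_step_list": [      // Denoising timestep list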
+    999, 934, 862, 756, 603, 410, 250, 140, 74
+  ]
+}
+```
+
+## 📜 Usage
+
+### Model Preparation
+
+Place the downloaded model (`causal_model.pt` or `causal_model.safetensors`) in the `causvid_models/` folder under the Wan model root directory:
+- For T2V: `Wan2.1-T2V-14B/causvid_models/`
+
+### Inference Script
+
+```bash
+bash scripts/wan/run_wan_t2v_causvid.sh
+```
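The model preparation step described above can be verified before launching the inference script. The following is a minimal sketch, not LightX2V code: the function name `find_causvid_checkpoint` is hypothetical, and checking for the `.pt` file before the `.safetensors` file is an arbitrary assumption. Only the folder name `causvid_models/` and the two file names come from the tutorial.

```python
from pathlib import Path
from typing import Optional

# File names taken from the tutorial; the lookup order is an assumption.
CHECKPOINT_NAMES = ("causal_model.pt", "causal_model.safetensors")


def find_causvid_checkpoint(wan_model_root: str) -> Optional[Path]:
    """Hypothetical helper: return the CausVid checkpoint path, or None if absent."""
    folder = Path(wan_model_root) / "causvid_models"
    for name in CHECKPOINT_NAMES:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # "Wan2.1-T2V-14B" is the T2V model root named in the tutorial.
    found = find_causvid_checkpoint("Wan2.1-T2V-14B")
    if found is None:
        print("No CausVid checkpoint found under Wan2.1-T2V-14B/causvid_models/")
    else:
        print(f"Found CausVid checkpoint: {found}")
```

Running the check from the directory that contains the Wan model root catches a misplaced checkpoint before `scripts/wan/run_wan_t2v_causvid.sh` is started.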