## Attention Mechanisms Supported by LightX2V

---

## Attention Usage Locations

The DiT model in `LightX2V` currently uses three types of attention. Each type of attention can be configured to use a specific backend library via the configuration keys listed below (see the example config after the list):
1. **Self-Attention on the image**
   - Configuration key: `self_attn_1_type`
2. **Cross-Attention between the image and the prompt text**
   - Configuration key: `cross_attn_1_type`
3. **Cross-Attention between the image and the reference image (in I2V mode)**
   - Configuration key: `cross_attn_2_type`
Tip: `radial_attn` can only be used for self-attention due to the limitations of its sparse-attention algorithm.
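The snippet below is a minimal sketch of how these keys could be set in an inference config file. The key names come from the list above; the backend value `flash_attn3` is only an illustrative choice and assumes the corresponding library is installed, so substitute whichever supported backend you intend to use.

```json
{
  "self_attn_1_type": "flash_attn3",
  "cross_attn_1_type": "flash_attn3",
  "cross_attn_2_type": "flash_attn3"
}
```

The three keys can also be mixed, for example keeping `radial_attn` only for `self_attn_1_type` while the two cross-attention keys use a dense backend, consistent with the tip above.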
For further customization of attention mechanism behavior, please refer to the official documentation or implementation code of each attention library.
---

## Step Distillation

Step distillation is an important optimization technique in LightX2V. By training distilled models, it reduces the number of inference steps from the original 40-50 down to **4**, dramatically improving inference speed while maintaining video quality. LightX2V implements step distillation together with CFG distillation to further accelerate inference.
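As a rough sketch of what this means for an inference configuration, a step-distilled run uses 4 denoising steps and skips classifier-free guidance. The key names `infer_steps` and `enable_cfg` below are assumptions for illustration only; the configs shipped with LightX2V's step-distilled models are the authoritative reference.

```json
{
  "infer_steps": 4,
  "enable_cfg": false
}
```

Compared with a standard run (40-50 steps, each requiring two forward passes when CFG is enabled), this reduces the number of model evaluations by roughly a factor of 20.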