Commit 0c18f343 authored by helloyongyang

update attention docs

parent 259085cc
```json
{
    "infer_steps": 40,
    "target_video_length": 81,
    "target_height": 480,
    "target_width": 832,
    "self_attn_1_type": "radial_attn",
    "cross_attn_1_type": "flash_attn3",
    "cross_attn_2_type": "flash_attn3",
    "seed": 42,
    "sample_guide_scale": 5,
    "sample_shift": 5,
    "enable_cfg": true,
    "cpu_offload": false
}
```
# Attention Mechanisms

The DiT model in `LightX2V` currently uses attention in three places, and each location can be configured with a specific backend attention library.

## Attention Usage Locations

1. **Self-Attention on the image**
   - Configuration key: `self_attn_1_type`
2. **Cross-Attention between the image and the prompt text**
   - Configuration key: `cross_attn_1_type`
3. **Cross-Attention between the image and the reference image (in I2V mode)**
   - Configuration key: `cross_attn_2_type`
---

## Attention Mechanisms Supported by LightX2V

| Name | Type Name | GitHub Link |
|--------------------|------------------|-------------|
| Flash Attention 2 | `flash_attn2` | [flash-attention v2](https://github.com/Dao-AILab/flash-attention) |
| Flash Attention 3 | `flash_attn3` | [flash-attention v3](https://github.com/Dao-AILab/flash-attention) |
| Sage Attention 2 | `sage_attn2` | [SageAttention](https://github.com/thu-ml/SageAttention) |
| Radial Attention | `radial_attn` | [Radial Attention](https://github.com/mit-han-lab/radial-attention) |
| Sparge Attention | `sparge_ckpt` | [Sparge Attention](https://github.com/thu-ml/SpargeAttn) |

---

## Configuration Examples

The configuration files for the attention mechanisms are located [here](https://github.com/ModelTC/lightx2v/tree/main/configs/attentions).
By passing `--config_json` with the path to a specific config file, you can test different attention mechanisms. For example, for `radial_attn`, the configuration is as follows:
```json
{
    "infer_steps": 40,
    "target_video_length": 81,
    "target_height": 480,
    "target_width": 832,
    "self_attn_1_type": "radial_attn",
    "cross_attn_1_type": "flash_attn3",
    "cross_attn_2_type": "flash_attn3",
    "seed": 42,
    "sample_guide_scale": 5,
    "sample_shift": 5,
    "enable_cfg": true,
    "cpu_offload": false
}
```
To switch to another attention backend, simply replace these values with the type names from the table above.

Tip: due to the limitations of its sparse-attention algorithm, `radial_attn` can only be used for self-attention.
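These two constraints (the values must come from the table above, and `radial_attn` only works for self-attention) are easy to check before launching a run. The snippet below is a minimal pre-flight sketch, not part of the LightX2V codebase; the helper name and the example path are illustrative assumptions:

```python
import json

# Type names from the table above.
SUPPORTED_ATTN_TYPES = {"flash_attn2", "flash_attn3", "sage_attn2", "radial_attn", "sparge_ckpt"}

# The three attention-site keys described in "Attention Usage Locations".
ATTN_KEYS = ("self_attn_1_type", "cross_attn_1_type", "cross_attn_2_type")

def validate_attention_config(path: str) -> None:
    """Raise ValueError if the config uses an unknown or misplaced attention type."""
    with open(path) as f:
        config = json.load(f)
    for key in ATTN_KEYS:
        attn_type = config.get(key)
        if attn_type is None:
            continue  # key absent from this config; nothing to check
        if attn_type not in SUPPORTED_ATTN_TYPES:
            raise ValueError(f"{key}={attn_type!r} is not one of {sorted(SUPPORTED_ATTN_TYPES)}")
        # radial_attn is restricted to self-attention (see the tip above).
        if attn_type == "radial_attn" and key != "self_attn_1_type":
            raise ValueError(f"radial_attn cannot be used for {key}; it only supports self-attention")

# Example usage (hypothetical config path):
# validate_attention_config("configs/attentions/wan_i2v.json")
```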
---
For Sparge Attention, refer to the `wan_t2v_sparge.json` configuration file. Sparge Attention requires the path to a post-trained weight checkpoint:

```json
{
    "self_attn_1_type": "flash_attn3",
    "cross_attn_1_type": "flash_attn3",
    "cross_attn_2_type": "flash_attn3",
    "sparge": true,
    "sparge_ckpt": "/path/to/sparge_wan2.1_t2v_1.3B.pt"
}
```
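Because `sparge_ckpt` must point at an actual post-trained weight file, a small pre-flight check can fail fast on a bad path. This is again a hypothetical sketch, not LightX2V API; the helper name is invented for illustration and the example path mirrors the placeholder above:

```python
import json
import os

def check_sparge_ckpt(config_path: str) -> None:
    """Verify that a config enabling sparge also points at an existing checkpoint file."""
    with open(config_path) as f:
        config = json.load(f)
    if config.get("sparge"):
        ckpt = config.get("sparge_ckpt")
        if not ckpt or not os.path.isfile(ckpt):
            raise FileNotFoundError(
                f"'sparge' is enabled but 'sparge_ckpt' is missing or not a file: {ckpt!r}"
            )

# Example usage (hypothetical config path):
# check_sparge_ckpt("configs/attentions/wan_t2v_sparge.json")
```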
---
For further customization of attention mechanism behavior, please refer to the official documentation or implementation code of each attention library.