# Attention Mechanisms

### Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

[paper](https://arxiv.org/abs/2502.01776) | [code](https://github.com/svg-project/Sparse-VideoGen)

### Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

[paper](https://arxiv.org/abs/2505.18875)

### Training-free and Adaptive Sparse Attention for Efficient Long Video Generation

[paper](https://arxiv.org/abs/2502.21079)

### DSV: Exploiting Dynamic Sparsity to Accelerate Large-Scale Video DiT Training

[paper](https://arxiv.org/abs/2502.07590)

### MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention

[code](https://github.com/microsoft/MInference)

### FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion

[paper](https://arxiv.org/abs/2506.04648)

### VORTA: Efficient Video Diffusion via Routing Sparse Attention

[paper](https://arxiv.org/abs/2505.18809)

### Training-Free Efficient Video Generation via Dynamic Token Carving

[paper](https://arxiv.org/abs/2505.16864)

### RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy

[paper](https://arxiv.org/abs/2505.21036)

### Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation

[paper](https://arxiv.org/abs/2506.19852)

### VMoBA: Mixture-of-Block Attention for Video Diffusion Models

[paper](https://arxiv.org/abs/2506.23858)

### SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference

[paper](https://arxiv.org/abs/2502.18137) | [code](https://github.com/thu-ml/SpargeAttn)
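
Many of the training-free methods in this list share a common skeleton: cheaply estimate the importance of each (query-block, key-block) pair, then compute attention only over the blocks that matter. Below is a minimal PyTorch sketch of that generic pattern — not the actual algorithm of SpargeAttention or any other entry here; the mean-pooling proxy score and the `block_size`/`top_k` parameters are illustrative assumptions.

```python
# Illustrative block-sparse attention: block importance is estimated from
# mean-pooled queries/keys, then attention is restricted to the top-k key
# blocks of each query block. A real kernel would skip the masked blocks
# rather than materialize the full score matrix as done here.
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block_size=64, top_k=8):
    # q, k, v: [batch, heads, seq_len, head_dim]; seq_len % block_size == 0
    B, H, N, D = q.shape
    nb = N // block_size
    scale = D ** -0.5

    # Cheap proxy: mean-pool each block, score block pairs with one matmul.
    q_pool = q.view(B, H, nb, block_size, D).mean(dim=3)        # [B, H, nb, D]
    k_pool = k.view(B, H, nb, block_size, D).mean(dim=3)        # [B, H, nb, D]
    block_scores = q_pool @ k_pool.transpose(-1, -2) * scale    # [B, H, nb, nb]

    # Keep only the top-k key blocks per query block.
    top_idx = block_scores.topk(min(top_k, nb), dim=-1).indices
    block_mask = torch.zeros_like(block_scores, dtype=torch.bool)
    block_mask.scatter_(-1, top_idx, True)

    # Expand to token granularity and run masked attention.
    token_mask = block_mask.repeat_interleave(block_size, dim=-2)
    token_mask = token_mask.repeat_interleave(block_size, dim=-1)  # [B,H,N,N]
    attn = (q @ k.transpose(-1, -2)) * scale
    attn = attn.masked_fill(~token_mask, float("-inf"))
    return F.softmax(attn, dim=-1) @ v
```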

### Fast Video Generation with Sliding Tile Attention

[paper](https://arxiv.org/abs/2502.04507) | [code](https://github.com/hao-ai-lab/FastVideo)
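
Sliding-window and neighborhood approaches instead fix the sparsity pattern: each query attends only to a local spatio-temporal window, evaluated at tile/block granularity so the pattern stays hardware-friendly. A generic sketch of such a tile-level mask follows; the tile counts and window radii are made-up parameters, not the actual configuration of Sliding Tile Attention.

```python
# Illustrative tile-level sliding-window mask for video tokens arranged as a
# (T, H, W) grid of tiles; mask[i, j] = True means tile pair (i, j) is computed.
import torch

def sliding_tile_mask(t_tiles, h_tiles, w_tiles, radius=(1, 2, 2)):
    coords = torch.stack(torch.meshgrid(
        torch.arange(t_tiles), torch.arange(h_tiles), torch.arange(w_tiles),
        indexing="ij"), dim=-1).reshape(-1, 3)                 # [n_tiles, 3]
    # Keep a tile pair iff it is within the window radius on every axis.
    diff = (coords[:, None, :] - coords[None, :, :]).abs()     # [n, n, 3]
    return (diff <= torch.tensor(radius)).all(dim=-1)          # [n, n] bool

mask = sliding_tile_mask(4, 6, 6)    # 4x6x6 tiles -> [144, 144] block mask
print(mask.float().mean().item())    # fraction of tile pairs actually computed
```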

### PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models

[paper](https://arxiv.org/abs/2506.16054)

### Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light

[paper](https://arxiv.org/abs/2504.16922)

### Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers

[paper](https://arxiv.org/abs/2506.05096)

### ∇NABLA: Neighborhood Adaptive Block-Level Attention

[paper](https://arxiv.org/abs/2507.13546v1) | [code](https://github.com/gen-ai-team/Wan2.1-NABLA)

### Compact Attention: Exploiting Structured Spatio-Temporal Sparsity for Fast Video Generation

[paper](https://arxiv.org/abs/2508.12969)

### A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention

[paper](https://attention-survey.github.io/files/Attention_Survey.pdf)

### Bidirectional Sparse Attention for Faster Video Diffusion Training

[paper](https://arxiv.org/abs/2509.01085)

### Mixture of Contexts for Long Video Generation

[paper](https://arxiv.org/abs/2508.21058)

### LoViC: Efficient Long Video Generation with Context Compression

[paper](https://arxiv.org/abs/2507.12952)

### MagiAttention: A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Mask Training

[paper](https://sandai-org.github.io/MagiAttention/blog/) | [code](https://github.com/SandAI-org/MagiAttention)

### DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance

[paper](https://arxiv.org/abs/2505.14708) | [code](https://github.com/shawnricecake/draft-attention)

### XAttention: Block Sparse Attention with Antidiagonal Scoring

[paper](https://arxiv.org/abs/2503.16428) | [code](https://github.com/mit-han-lab/x-attention)
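
The antidiagonal idea can be illustrated simply: the antidiagonal of a score block touches every row and every column exactly once, so its sum is a cheap proxy for the block's total attention mass. Below is a simplified sketch of that scoring step (the real method computes only strided antidiagonal entries and never the full score matrix; shapes and parameters here are illustrative).

```python
# Simplified antidiagonal block scoring. Illustrative only: materializes the
# full score matrix, which the actual method avoids.
import torch

def antidiagonal_block_scores(q, k, block_size=64):
    # q, k: [heads, seq_len, head_dim]; returns [heads, nb, nb] block scores.
    H, N, D = q.shape
    nb = N // block_size
    scores = (q @ k.transpose(-1, -2)) * D ** -0.5             # [H, N, N]
    blocks = (scores.view(H, nb, block_size, nb, block_size)
                    .permute(0, 1, 3, 2, 4))                   # [H, nb, nb, b, b]
    # Flip columns, then the main diagonal equals the original antidiagonal.
    anti = torch.flip(blocks, dims=[-1]).diagonal(dim1=-2, dim2=-1)
    return anti.sum(dim=-1)                                    # [H, nb, nb]
```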

### VSA: Faster Video Diffusion with Trainable Sparse Attention

[paper](https://arxiv.org/abs/2505.13389) | [code](https://github.com/hao-ai-lab/FastVideo)

### QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification

[paper](https://arxiv.org/abs/2509.23681)

### SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

[paper](https://arxiv.org/abs/2509.24006)