[Refactor][Example] Update linear attention examples and add tests (#1010)
* [Refactor][Example] Update linear attention examples and add tests
  - Refactored the backward and forward linear attention kernels to use shared memory and atomic additions for improved performance.
  - Introduced L2 normalization in the main functions of both examples.
  - Added a new test suite for the linear attention examples to ensure correctness and performance.
  - Updated argument parsing in the main functions for better usability.
* upd docstring for tma atomic add
* lint
* Add flash-linear-attention dependency to requirements.txt
* Rename main function to chunk_linear_attn_bwd
* Rename main function to chunk_linear_attn_fwd
* chore

---------

Co-authored-by: LeiWang1999 <leiwang1999@outlook.com>
Co-authored-by: Lei Wang <34334180+LeiWang1999@users.noreply.github.com>
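The renamed `chunk_linear_attn_fwd` kernel itself is not shown in this diff, but the chunked linear-attention recurrence it is based on, with the L2 normalization the commit introduces, can be sketched in plain PyTorch. This is a reference sketch only, not the TileLang kernel; the function name `chunk_linear_attn_fwd_ref`, the `chunk_size` default, and the tensor layout `(B, H, T, D)` are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def chunk_linear_attn_fwd_ref(q, k, v, chunk_size=64):
    """Reference chunked causal linear attention (sketch, not the kernel).

    q, k: (B, H, T, D); v: (B, H, T, Dv). Assumes T % chunk_size == 0.
    """
    B, H, T, D = q.shape
    # L2-normalize q and k along the feature dimension, as the commit does.
    q = F.normalize(q, p=2, dim=-1)
    k = F.normalize(k, p=2, dim=-1)
    o = torch.empty_like(v)
    # Running state: sum of k_s v_s^T over all chunks processed so far.
    state = q.new_zeros(B, H, D, v.shape[-1])
    for s in range(0, T, chunk_size):
        qi = q[:, :, s:s + chunk_size]
        ki = k[:, :, s:s + chunk_size]
        vi = v[:, :, s:s + chunk_size]
        # Inter-chunk term: contribution of all previous chunks via the state.
        inter = qi @ state
        # Intra-chunk term: causal attention within the current chunk.
        intra = (qi @ ki.transpose(-1, -2)).tril() @ vi
        o[:, :, s:s + chunk_size] = inter + intra
        # Fold this chunk into the state for later chunks.
        state = state + ki.transpose(-1, -2) @ vi
    return o
```

Splitting the O(T^2) causal computation into an intra-chunk quadratic part plus a rank-D inter-chunk state is what makes the kernel amenable to shared-memory tiling; the sketch reproduces the math at small scale for testing.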
```diff
--- a/requirements.txt
+++ b/requirements.txt
@@ -7,3 +7,4 @@ torch
 torch>=2.7; platform_system == 'Darwin'
 tqdm>=4.62.3
 typing-extensions>=4.10.0
+flash-linear-attention==0.3.2
\ No newline at end of file
```