    [Refactor][Example] Update linear attention examples and add tests (#1010) · ae9a6f0a
    Tong WU authored
    
    
    * [Refactor][Example] Update linear attention examples and add tests
    
    - Refactored the forward and backward linear attention kernels to stage partial results in shared memory and accumulate them into global memory with atomic additions for improved performance (a minimal kernel sketch follows this list).
    - Introduced L2 normalization in the main functions of both examples (see the driver sketch below).
    - Added a new test suite for the linear attention examples to verify correctness against a reference implementation and to track performance (a test sketch below shows the shape of such a check).
    - Updated argument parsing in the main functions for better usability (also covered in the driver sketch below).
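    
    A minimal sketch of the shared-memory-plus-atomic-add pattern the first bullet refers to, written against TileLang's Python DSL. The buffer shapes, tile sizes, and the exact T.copy/T.atomic_add/tilelang.compile usage are assumptions for illustration, not code lifted from the examples.
    
        # Hedged sketch only: shapes, tile sizes, and API details are assumptions.
        import tilelang
        import tilelang.language as T
        
        
        def accumulate_kernel(M=4096, N=4096, block_M=64, block_N=64, dtype="float32"):
        
            @T.prim_func
            def main(Src: T.Tensor((M, N), dtype), Dst: T.Tensor((M, N), dtype)):
                with T.Kernel(T.ceildiv(N, block_N), T.ceildiv(M, block_M), threads=128) as (bx, by):
                    # Stage a tile of the partial result in shared memory ...
                    tile = T.alloc_shared((block_M, block_N), dtype)
                    T.copy(Src[by * block_M, bx * block_N], tile)
                    # ... then accumulate it into global memory atomically, so blocks
                    # writing to overlapping regions do not race with each other.
                    T.atomic_add(
                        Dst[by * block_M:(by + 1) * block_M, bx * block_N:(bx + 1) * block_N],
                        tile,
                    )
        
            return main
        
        
        # Assumed compile entry point; the examples may use tilelang.jit instead.
        kernel = tilelang.compile(accumulate_kernel())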
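    
    A driver-side sketch of the L2 normalization and argument-parsing changes mentioned above. The flag names, tensor shapes, and the placement of the normalization are assumptions; the actual call into the example's kernel is omitted because its signature is not shown in this commit message.
    
        # Hedged sketch: flag names, shapes, and dtype choices are placeholders.
        import argparse
        
        import torch
        import torch.nn.functional as F
        
        
        def main():
            parser = argparse.ArgumentParser(description="Linear attention forward example")
            parser.add_argument("--batch", type=int, default=8)
            parser.add_argument("--heads", type=int, default=16)
            parser.add_argument("--seq_len", type=int, default=2048)
            parser.add_argument("--dim", type=int, default=64)
            args = parser.parse_args()
        
            device = "cuda" if torch.cuda.is_available() else "cpu"
            shape = (args.batch, args.heads, args.seq_len, args.dim)
            q = torch.randn(shape, device=device, dtype=torch.float16)
            k = torch.randn(shape, device=device, dtype=torch.float16)
            v = torch.randn(shape, device=device, dtype=torch.float16)
        
            # L2-normalize q and k along the feature dimension so the linear
            # attention feature maps stay bounded; the exact placement inside the
            # example's main function is an assumption.
            q = F.normalize(q, p=2, dim=-1)
            k = F.normalize(k, p=2, dim=-1)
        
            # q, k, v would then be handed to the example's TileLang kernel
            # (renamed to chunk_linear_attn_fwd in this commit); that call is
            # omitted here because its exact signature is not shown above.
        
        
        if __name__ == "__main__":
            main()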
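    
    For the new test suite, the flash-linear-attention package added to requirements.txt below presumably serves as the reference implementation; the self-contained stand-in below checks against a naive PyTorch linear attention instead, and the function names and tolerances are assumptions.
    
        # Hedged test sketch: the real suite likely compares against the
        # flash-linear-attention (fla) reference; a naive PyTorch reference is
        # used here so the sketch runs on its own. Names/tolerances are guesses.
        import torch
        import torch.nn.functional as F
        
        
        def naive_causal_linear_attn(q, k, v):
            # Causal linear attention without softmax: o_t = q_t @ sum_{s<=t} k_s^T v_s.
            scores = torch.einsum("bhtd,bhsd->bhts", q, k)
            mask = torch.tril(torch.ones(q.shape[2], q.shape[2], dtype=torch.bool, device=q.device))
            return torch.einsum("bhts,bhsd->bhtd", scores.masked_fill(~mask, 0), v)
        
        
        def test_chunk_linear_attn_fwd():
            torch.manual_seed(0)
            B, H, T, D = 1, 2, 128, 64
            q = F.normalize(torch.randn(B, H, T, D), p=2, dim=-1)
            k = F.normalize(torch.randn(B, H, T, D), p=2, dim=-1)
            v = torch.randn(B, H, T, D)
        
            ref = naive_causal_linear_attn(q, k, v)
            # out = chunk_linear_attn_fwd(q, k, v)  # the example's kernel; signature assumed
            out = ref                               # stand-in so the sketch runs as-is
            torch.testing.assert_close(out, ref, rtol=1e-2, atol=1e-2)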
    
    * Update docstring for TMA atomic add
    
    * lint
    
    * Add flash-linear-attention dependency to requirements.txt
    
    * Rename main function to chunk_linear_attn_bwd
    
    * Rename main function to chunk_linear_attn_fwd
    
    * chore
    
    ---------
    Co-authored-by: LeiWang1999 <leiwang1999@outlook.com>
    Co-authored-by: Lei Wang <34334180+LeiWang1999@users.noreply.github.com>