    [Example] Implement simple block sparse kernel (#106) · c7462abf
    Lei Wang authored
    * Remove Torch CPP backend and update execution backend options
    
    - Remove TorchCPPKernelAdapter and related code from JIT modules
    - Update execution backend options in jit/__init__.py, kernel.py, and adapter/__init__.py
    - Remove "torch_cpp" from supported execution backend literals
    - Simplify backend validation and remove unused torch_cpp-related code (see the sketch below)
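    
    A minimal sketch of what the simplified validation could look like once
    "torch_cpp" is dropped from the supported literals, assuming a plain
    membership check; ExecutionBackend, SUPPORTED_BACKENDS, and
    validate_backend are illustrative names, not the exact identifiers in
    jit/__init__.py:
    
        from typing import Literal
        
        # Hypothetical literal values; the real options in tilelang may differ.
        # The point is that "torch_cpp" is no longer a member.
        ExecutionBackend = Literal["cython", "ctypes"]
        
        SUPPORTED_BACKENDS = ("cython", "ctypes")
        
        def validate_backend(backend: str) -> str:
            # With the torch_cpp branch removed, validation reduces to a
            # single membership check.
            if backend not in SUPPORTED_BACKENDS:
                raise ValueError(
                    f"Unsupported execution backend {backend!r}; "
                    f"expected one of {SUPPORTED_BACKENDS}")
            return backend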
    
    * Lint fix
    
    * Add block sparse attention implementations for TileLang and Triton
    
    - Implement block sparse attention kernels for TileLang and Triton
    - Add example scripts for block sparse attention with top-k and threshold-based masking
    - Include utility functions for generating sparse attention masks (sketched after this list)
    - Demonstrate causal attention with block-level sparsity
    - Add test cases to validate sparse attention implementations against PyTorch reference
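    
    The commit text does not include the kernels themselves, but the
    block-level mask utilities it mentions can be sketched roughly as
    follows; the function names, the pooled block_scores input, and the
    policy details are assumptions for illustration, not the actual code in
    the example scripts:
    
        import torch
        
        def topk_block_mask(block_scores: torch.Tensor, k: int) -> torch.Tensor:
            """Keep the k highest-scoring key blocks per query block.
        
            block_scores: [num_q_blocks, num_kv_blocks] pooled attention
            estimates. Returns a boolean mask (True = compute this block).
            """
            idx = block_scores.topk(k, dim=-1).indices
            mask = torch.zeros_like(block_scores, dtype=torch.bool)
            mask.scatter_(-1, idx, True)
            return mask
        
        def threshold_block_mask(block_scores: torch.Tensor, t: float) -> torch.Tensor:
            """Alternative policy: keep every block scoring above a threshold."""
            return block_scores > t
        
        def causal_block_mask(num_blocks: int) -> torch.Tensor:
            """Block-level causality: query block i sees key blocks j <= i."""
            return torch.tril(torch.ones(num_blocks, num_blocks, dtype=torch.bool))
        
        # Combine a sparsity policy with causality; top-k alone could pick
        # blocks above the diagonal, so intersect the two masks.
        nq = nk = 8
        scores = torch.rand(nq, nk)
        mask = topk_block_mask(scores, k=3) & causal_block_mask(nk)
    
    In the examples the commit describes, a mask like this steers the
    TileLang/Triton kernels to skip masked-out key blocks entirely, and the
    sparse output is then checked against a dense PyTorch attention
    reference.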
example_mha_fwd_bshd.py 9.44 KB