    [Bugfix] Cast bool dtype into int8 in blocksparse examples (#167) · b6c48453
    Lei Wang authored
    * [Refactor] Update BitBLAS Benchmark with TileLang Carver Imports and Roller Hints Generation
    
    - Replace BitBLAS imports with TileLang Carver imports in benchmark_matmul.py
    - Modify roller hints generation using new TileLang Carver template and utility functions
    - Update get_roller_hints_from_func to handle None cases and improve return logic
    - Adjust DefaultPolicy to handle different codegen dictionary formats
    
    * [Refactor] Update Thread Binding and Import Statements in TileLang Kernels
    
    - Replace T.thread_binding() with T.get_thread_binding() across multiple kernel test files
    - Update import statements for MMA layout and macro generator in dequantize GEMM and FP8 examples
    - Move map_torch_type utility function to tilelang.utils.tensor
    - Remove unnecessary imports and improve code organization
    
    * Refactor Native Sparse Attention Example with Enhanced Triton Kernel
    
    - Update parallel_nsa_fwd_kernel to support more flexible sparse attention computation
    - Add support for block counts and offsets in the Triton kernel
    - Modify kernel grid and computation logic for improved performance
    - Update example script to use naive_nsa_simple reference implementation
    - Improve type hints and kernel configuration
example_triton_nsa.py 7.59 KB