    [Bugfix] Cast bool dtype into int8 in blocksparse examples (#167) · b6c48453
    Lei Wang authored
    * [Refactor] Update BitBLAS Benchmark with TileLang Carver Imports and Roller Hints Generation
    
    - Replace BitBLAS imports with TileLang Carver imports in benchmark_matmul.py
    - Modify roller hints generation using new TileLang Carver template and utility functions
    - Update get_roller_hints_from_func to handle None cases and improve return logic
    - Adjust DefaultPolicy to handle different codegen dictionary formats
    
    * [Refactor] Update Thread Binding and Import Statements in TileLang Kernels
    
    - Replace T.thread_binding() with T.get_thread_binding() across multiple kernel test files
    - Update import statements for MMA layout and macro generator in dequantize GEMM and FP8 examples
    - Move map_torch_type utility function to tilelang.utils.tensor
    - Remove unnecessary imports and improve code organization
    
    * Refactor Native Sparse Attention Example with Enhanced Triton Kernel
    
    - Update parallel_nsa_fwd_kernel to support more flexible sparse attention computation
    - Add support for block counts and offsets in the Triton kernel
    - Modify kernel grid and computation logic for improved performance
    - Update example script to use naive_nsa_simple reference implementation
    - Improve type hints and kernel configuration
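
The commit title mentions casting a bool dtype into int8 in the blocksparse examples, but the change itself is not shown here. The following is a minimal sketch of that kind of cast, using only the Python standard library so it runs without a tensor library. The names `bool_mask_to_int8` and `block_mask` are hypothetical and are not taken from the repository. The example file may well do the cast on tensors instead.

```python
from array import array


def bool_mask_to_int8(mask):
    """Convert a 2D list of bools into rows of signed 8-bit integers."""
    # 'b' is the signed-char typecode (one byte per element, i.e. int8):
    # True becomes 1 and False becomes 0.
    return [array("b", (int(v) for v in row)) for row in mask]


# Hypothetical 2x3 block mask: True marks a block that should be computed.
block_mask = [
    [True, False, True],
    [False, True, True],
]

int8_mask = bool_mask_to_int8(block_mask)
```

Each row of `int8_mask` keeps the same 0/1 pattern as the boolean mask while storing every entry as a one-byte integer.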
block_sparse_attn_tilelang.py 11.1 KB