  • [Refactor] refactor autotune examples (#617) · d110d087
    Lei Wang authored
    * [Refactor] Update tilelang kernel functions and remove unused imports
    
    - Refactored the `flashattn_fwd`, `flashattn_bwd_preprocess`, and `flashattn_bwd_postprocess` functions to utilize direct kernel calls instead of cached versions, improving clarity and performance.
    - Added `@tilelang.jit` decorators with specified output indices to enhance kernel compilation.
    - Removed unused import of `cached` from `tilelang`, streamlining the code.
    - Commented out the main testing function call in `test_tilelang_kernel_mha_bwd.py` for potential future use.
    
    * [Refactor] Simplify configuration generation in benchmark and example scripts
    
    - Refactored the `get_configs` functions in multiple benchmark and example scripts to utilize a dictionary-based approach for parameter configuration, improving readability and maintainability.
    - Updated the `flashattn` and `chunk_scan_fwd` functions to directly accept configuration parameters, enhancing flexibility in kernel tuning.
    - Removed redundant code and streamlined the configuration generation process across various files, ensuring consistency in how configurations are defined and utilized.
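
The dictionary-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the parameter names (`block_M`, `block_N`, `num_stages`, `threads`) and their candidate values are assumptions chosen for the example.

```python
import itertools


def get_configs():
    # Hypothetical tunable parameters and candidate values (illustration only).
    iter_params = dict(
        block_M=[64, 128],
        block_N=[64, 128],
        num_stages=[1, 2],
        threads=[128, 256],
    )
    # Cartesian product of all candidate values, one dict per configuration.
    return [
        dict(zip(iter_params, values))
        for values in itertools.product(*iter_params.values())
    ]


configs = get_configs()
```

Keeping the search space in a single dictionary means that adding a parameter is a one-line change, and each resulting config can be passed to a kernel factory as keyword arguments.
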
    
    * [Refactor] Update configuration handling in benchmark scripts
    
    - Refactored the `get_configs` functions in benchmark scripts to accept a variable argument list, improving flexibility in configuration management.
    - Enhanced the `matmul` and `flashattn` functions to utilize the updated configuration approach, streamlining parameter handling for kernel tuning.
    - Added `@autotune` decorators to relevant functions, ensuring consistent autotuning behavior across benchmarks.
    - Cleaned up redundant code and improved overall readability in the affected files.
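
One way a `get_configs` that accepts a variable argument list might look is sketched below. It assumes the arguments are the problem dimensions `(M, N, K)` and that oversized tiles should be filtered out; both the argument meaning and the filtering rule are assumptions for illustration, not behavior confirmed by the commit.

```python
import itertools


def get_configs(*args):
    # Assumption for this sketch: args are the problem dimensions (M, N, K).
    M, N, K = args
    # Hypothetical tile sizes (illustration only).
    iter_params = dict(
        block_M=[64, 128, 256],
        block_N=[64, 128, 256],
        block_K=[32, 64],
    )
    configs = [
        dict(zip(iter_params, values))
        for values in itertools.product(*iter_params.values())
    ]
    # Drop any tile that is larger than the problem itself.
    return [
        c for c in configs
        if c["block_M"] <= M and c["block_N"] <= N and c["block_K"] <= K
    ]


small = get_configs(128, 128, 64)
large = get_configs(4096, 4096, 4096)
```

Passing the call arguments through lets the search space depend on the problem being tuned rather than being fixed at import time.
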
    
    * [Refactor] Clean up formatting and update subproject commit
    
    - Updated the subproject commit reference in the TVM directory to indicate a dirty state.
    - Removed unnecessary blank lines and improved formatting in the `benchmark_matmul` and `benchmark_matmul_fp8` scripts for better readability.
    - Streamlined the function definitions in the `flashattn` example script to enhance clarity and maintainability.
    
    * [Refactor] Update AutoTuner configuration handling
    
    - Modified the AutoTuner class to check if kernel parameters are set before processing tunable arguments, improving robustness in configuration handling.
    - Enhanced the logic for skipping compilation when tunable parameters are already provided, ensuring efficient use of resources.
    - Updated comments for clarity and maintainability.
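
The skip logic described above can be illustrated with a toy tuner. This is a sketch under stated assumptions, not tilelang's `AutoTuner`: the class `ToyTuner`, the `cost` function, and the `block` parameter are all hypothetical, and the "kernel" here simply returns a cost so that the example runs without a GPU.

```python
class ToyTuner:
    """Toy stand-in for an autotuner; not tilelang's AutoTuner."""

    def __init__(self, fn, configs, tunable):
        self.fn = fn
        self.configs = configs
        self.tunable = tunable
        self.searched = 0  # how many times the search actually ran

    def __call__(self, *args, **kwargs):
        # If the caller already fixed every tunable parameter, skip the search.
        if all(name in kwargs for name in self.tunable):
            return self.fn(*args, **kwargs)
        self.searched += 1
        # Otherwise evaluate every config and keep the one with the lowest cost.
        best = min(self.configs, key=lambda cfg: self.fn(*args, **{**cfg, **kwargs}))
        return self.fn(*args, **{**best, **kwargs})


def cost(n, block=1):
    # Hypothetical cost model: padding wasted when n is rounded up to a multiple of block.
    return (-n) % block


tuner = ToyTuner(cost, configs=[{"block": b} for b in (32, 48, 64)], tunable=["block"])
tuned = tuner(96)            # no tunable given, so the search runs
fixed = tuner(96, block=64)  # tunable given, so the search is skipped
```

Checking for caller-supplied parameters before searching avoids repeating the search when a known-good configuration is already in hand.
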
    
    * lint fix
    
    * Update TVM subproject commit to indicate dirty state and modify MHA backward test cases
    
    - Updated the subproject commit reference in the TVM directory to reflect a dirty state.
    - Adjusted the `test_mha_bwd` function to use a new configuration for the MHA backward tests, changing the context size from 128 to 256.
    - Uncommented the main testing function call for potential execution.
benchmark_matmul.py 7.9 KB