1. 13 Jul, 2025 1 commit
    • Lei Wang's avatar
      [AutoTune] Support `with set_autotune_inputs` to set auto tuning input tensors (#632) · eec47592
      Lei Wang authored
      * [Refactor] Simplify and modularize autotuner implementation
      
      - Removed unused imports and extensive code sections from the autotuner module to enhance readability and maintainability.
      - Modularized the code by introducing new imports for autotuning and capturing functionalities, streamlining the overall structure.
      - Improved logging setup and removed redundant timeout handling functions, focusing on core autotuning logic.
      - Updated the AutoTuner class to better utilize the new modular structure, ensuring efficient performance during auto-tuning processes.
      
      * [Refactor] Clean up and enhance capture and tuner modules
      
      - Improved code readability by removing unnecessary blank lines and organizing imports in `capture.py` and `tuner.py`.
      - Enhanced logging in the `AutoTuner` class to provide clearer warnings regarding the usage of `supply_prog` in the context of auto-tuning.
      - Streamlined the `CaptureStack` class for better thread-local context management.
      
      * lint fix
      
      * [Refactor] Simplify configuration and autotuning logic in blocksparse GEMM example
      
      - Updated `get_configs` function to reduce the number of configurations, enhancing performance and clarity.
      - Removed the `get_best_config` function, integrating its logic directly into the `blocksparse_matmul` function with the `@autotune` decorator for streamlined autotuning.
      - Adjusted the main function to directly utilize the autotuned kernel, simplifying the overall structure and improving readability.
      - Deleted obsolete test file for autotuning decorator, cleaning up the codebase.
      
      * [Refactor] Improve code formatting and readability in autotune test file
      
      - Reformatted the `matmul` function and `get_configs` function for better readability by adjusting line breaks and indentation.
      - Fixed a typo in the `enable_rasteration` parameter name to ensure consistency.
      - Cleaned up unnecessary blank lines to enhance overall code clarity.
      
      * Update example_blocksparse_gemm.py
      
      * Update capture.py
      eec47592