    [Autotune] Introduce cache mechanism for auto tuner (#527) · 7171aff6
    Lei Wang authored
    * [Enhancement] Add commit ID to versioning and improve logging initialization
    
    * Updated `get_tilelang_version` to include an optional commit ID in the version string.
    * Enhanced the `TileLangBuilPydCommand` to write the version with commit ID to the VERSION file during the build process.
    * Introduced a new function `get_git_commit_id` in `version.py` to retrieve the current git commit hash.
    * Refactored logger initialization in `autotuner/__init__.py` to ensure handlers are set up only once, improving performance and clarity.
    * Minor fixes in `flatten_buffer.cc` and `kernel_cache.py` for better handling of versioning and logging.
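The versioning change above can be pictured with a minimal sketch. Only the name `get_git_commit_id` comes from this commit message; its body here, the `format_version` helper, and the `+<short hash>` suffix format are assumptions for illustration, not the project's actual code.

```python
import subprocess
from typing import Optional


def get_git_commit_id(cwd: Optional[str] = None) -> Optional[str]:
    """Return the current git commit hash, or None if it cannot be determined.

    Assumed behavior: the real function in `version.py` may differ.
    """
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            cwd=cwd,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, OSError):
        # Not a git checkout, or git is not installed.
        return None
    commit = out.decode("utf-8").strip()
    return commit or None


def format_version(base: str, commit_id: Optional[str] = None) -> str:
    """Hypothetical helper: append a short commit ID when one is available."""
    if not commit_id:
        return base
    return f"{base}+{commit_id[:8]}"
```

A build step could then write `format_version(base, get_git_commit_id())` to the VERSION file, falling back to the plain version string outside a git checkout.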
    
    * [Refactor] Enhance AutoTuner and JITKernel for improved performance and caching
    
    * Refactored the AutoTuner class to include new methods for setting compilation and profiling arguments, enhancing configurability.
    * Introduced caching mechanisms for tuning results, allowing for faster retrieval of previously computed configurations.
    * Updated JITKernel to store tuning results, including latency and configuration details, improving the kernel's performance tracking.
    * Added new methods for generating cache keys and saving/loading results to/from disk, streamlining the tuning process.
    * Enhanced the overall structure and readability of the autotuning logic, ensuring better maintainability and clarity.
    * Minor adjustments in related modules to support the new caching and profiling features.
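As a rough illustration of the caching idea described above, the sketch below derives a cache key from the tuning inputs and stores the best configuration and latency on disk. All names here (`make_cache_key`, `save_result`, `load_result`, the JSON layout) are hypothetical and are not taken from the AutoTuner implementation.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Dict, Optional


def make_cache_key(kernel_source: str, compile_args: Dict[str, Any], version: str) -> str:
    """Hash everything that should invalidate a cached tuning result."""
    payload = json.dumps(
        {"source": kernel_source, "compile_args": compile_args, "version": version},
        sort_keys=True,  # key order must not change the hash
        default=str,     # fall back to str() for non-JSON values
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def save_result(cache_dir: Path, key: str, config: Dict[str, Any], latency_ms: float) -> Path:
    """Write the best configuration and its latency to <cache_dir>/<key>.json."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{key}.json"
    path.write_text(json.dumps({"config": config, "latency_ms": latency_ms}))
    return path


def load_result(cache_dir: Path, key: str) -> Optional[Dict[str, Any]]:
    """Return the cached result for this key, or None on a cache miss."""
    path = cache_dir / f"{key}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text())
```

Including the version string (with commit ID) in the key is one way to make sure results tuned against an older build are not reused after the compiler changes.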
    
    * [Refactor] Clean up code formatting and improve readability in AutoTuner and related modules
    
    * Consolidated import statements and removed unnecessary line breaks for better readability.
    * Standardized function argument formatting across the AutoTuner and CompileArgs classes.
    * Enhanced consistency in the use of whitespace and indentation throughout the codebase.
    * Minor adjustments in the Profiler and JITKernel classes to improve clarity and maintainability.
    * Ensured that all changes adhere to the project's coding style guidelines.
    
    * [Refactor] Remove redundant type hints in AutoTuner modules
    
    * Simplified import statements in `__init__.py` and `param.py` by removing unnecessary duplicate type hints for `Any`.
    * Improved code readability and maintainability by streamlining type imports across the AutoTuner module.
    
    * [Refactor] Update AutoTuner configuration for improved profiling and target detection
    
    * Enhanced the AutoTuner configuration across multiple examples by adding `set_profile_args` to better manage profiling settings.
    * Standardized the use of `target="auto"` in compile arguments to ensure automatic target detection.
    * Removed redundant target specifications in certain instances to streamline the configuration process.
    * Improved overall clarity and maintainability of the autotuning logic in various example scripts.
    
    * [Refactor] Simplify code formatting and improve readability in example scripts
    
    * Consolidated function argument formatting in `benchmark_mla_decode_amd_tilelang.py`, `example_elementwise_add.py`, and `performance.py` for better clarity.
    * Removed unnecessary line breaks and standardized argument placement across multiple files.
    * Enhanced overall code readability and maintainability in autotuning examples and performance scripts.
    
    * [Refactor] Update JIT decorator usage across multiple files
    
    * Removed redundant parameters from the JIT decorator in various benchmark and example scripts, simplifying the code.
    * Standardized the import of the JIT decorator from `tilelang`, enhancing consistency across the codebase.
    * Improved overall readability and maintainability by consolidating import statements and cleaning up function definitions.
    
    * [Refactor] Standardize JIT decorator formatting across benchmark and example scripts
    
    * Simplified the formatting of the JIT decorator in multiple files by removing unnecessary line breaks.
    * Enhanced code readability and consistency in the usage of the JIT decorator across benchmark and example scripts.
    * Improved overall maintainability by ensuring uniformity in function definitions and decorator usage.
example_dequant_gemm_fp4_hopper.py 11.9 KB