Lei Wang authored
* [Refactor] Remove cache existence check in kernel saving logic
  - Eliminated redundant checks for existing cache paths in `AutotuneResult` and `AutoTunerCache` classes, simplifying the kernel saving process.
  - Ensured that the cache directory is always created before saving kernel source code, improving reliability in kernel storage.

* [Enhancement] Improve input tensor compatibility checks in AutoTuner
  - Enhanced the input tensor caching logic in the AutoTuner class to ensure compatibility between cached tensors and newly generated tensors during configuration trials.
  - Added detailed logging to warn users about potential mismatches in tensor properties, including shape and dtype, when caching is enabled.
  - Implemented a mechanism to regenerate input tensors if compatibility issues are detected, improving the robustness of the autotuning process.

* [Refactor] Update L2 persistent map initialization in CUDA wrapper
  - Adjusted the L2 persistent map initialization function to use a consistent size parameter for cache limits and byte counts, improving clarity and reducing potential errors in memory management.
  - Simplified the formatting of the initialization function to enhance readability and maintainability of the code.

* Update tilelang/autotuner/__init__.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
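Note on the kernel saving change: a minimal sketch of the "always create the directory, then write" pattern described in the first item. The function name `save_kernel_source` and the default file name `kernel.cu` are hypothetical, chosen for illustration only; they are not taken from the `AutotuneResult` or `AutoTunerCache` code.

```python
import os


def save_kernel_source(cache_dir: str, source: str, filename: str = "kernel.cu") -> str:
    """Hypothetical helper: write kernel source into cache_dir.

    The directory is created unconditionally; exist_ok=True makes the call
    a no-op when it already exists, so no separate existence check is needed.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, filename)
    with open(path, "w") as f:
        f.write(source)
    return path
```

Calling the helper twice with the same directory simply overwrites the file, which is the behavior one would expect once the existence check is gone.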
cce6aed8
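Note on the input tensor compatibility check: the sketch below illustrates the idea from the second item under stated assumptions. It compares cached inputs with freshly generated ones by `shape` and `dtype`, logs a warning on mismatch, and falls back to the fresh inputs. The names `tensors_compatible`, `select_inputs`, and `supply_fn` are hypothetical and do not reflect the actual AutoTuner code; any object exposing `.shape` and `.dtype` (for example a `torch.Tensor`) would work.

```python
import logging

logger = logging.getLogger("autotune_sketch")


def tensors_compatible(cached, fresh) -> bool:
    """Return True if both sequences match in length, shape, and dtype."""
    if len(cached) != len(fresh):
        return False
    for c, f in zip(cached, fresh):
        if tuple(c.shape) != tuple(f.shape) or c.dtype != f.dtype:
            return False
    return True


def select_inputs(cached, supply_fn):
    """Reuse cached inputs when compatible, otherwise regenerate them.

    supply_fn is a hypothetical zero-argument callable that builds the
    input tensors for the configuration currently being tried.
    """
    fresh = supply_fn()
    if cached is None:
        return fresh
    if tensors_compatible(cached, fresh):
        return cached
    logger.warning(
        "Cached input tensors do not match the current configuration "
        "(shape or dtype differs); regenerating inputs."
    )
    return fresh
```

One design consequence of this shape of check: the fresh tensors are generated on every trial just to compare metadata, which trades some setup cost for safety when configurations change input shapes.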