    [AutoTune] Refactor AutoTuneArtifact to utilize kernel as context instead of profiler (#344) · f005db9f
    Lei Wang authored
    * [Enhancement] Update GEMM examples and autotuner for improved performance
    
    - Modified `example_gemm_intrinsics.py` to enhance matrix multiplication configurations, increasing warp sizes and adjusting data types for better performance.
    - Updated the kernel compilation process to utilize the new `tilelang.compile` method and improved latency measurement with the profiler.
    - Refactored `example_gemm.py` to include a new autotuning configuration and ensure consistency in latency checks against reference results.
    - Adjusted tensor supply generation in `tilelang/utils/tensor.py` to use `torch.randn`, so that supplied tensors are drawn from a standard normal distribution.
    - Enhanced the `JITContext` in `tilelang/autotuner/__init__.py` to replace the profiler with a kernel instance for performance measurement, improving the overall structure of the autotuner.
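
The idea of carrying the kernel itself through the tuning context, rather than a separate profiler object, can be sketched in plain Python. This is only an illustration under stated assumptions: `TuneContext`, `measure_latency_ms`, and `pick_best` are hypothetical names, not the `JITContext` or profiler API in `tilelang/autotuner/__init__.py`, and wall-clock timing with `time.perf_counter` stands in for whatever device-side timing the real profiler uses.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass
class TuneContext:
    """Hypothetical context: holds the compiled kernel directly."""
    kernel: Callable[..., object]
    config: dict = field(default_factory=dict)


def measure_latency_ms(ctx: TuneContext, args: Sequence, warmup: int = 3, rep: int = 10) -> float:
    """Average wall-clock latency of ctx.kernel in milliseconds."""
    # Warm-up runs are not timed.
    for _ in range(warmup):
        ctx.kernel(*args)
    start = time.perf_counter()
    for _ in range(rep):
        ctx.kernel(*args)
    return (time.perf_counter() - start) * 1e3 / rep


def pick_best(contexts: Sequence[TuneContext], args: Sequence) -> TuneContext:
    """Return the context whose kernel has the lowest measured latency."""
    return min(contexts, key=lambda ctx: measure_latency_ms(ctx, args))


if __name__ == "__main__":
    # Toy "kernels": the same computation with an artificial delay on one.
    fast = TuneContext(kernel=lambda x: x * 2, config={"block": 128})
    slow = TuneContext(kernel=lambda x: time.sleep(0.002) or x * 2, config={"block": 32})
    best = pick_best([slow, fast], args=(21,))
    print("best config:", best.config)
```

Because the context owns the kernel, any measurement or correctness helper can be built from it on demand instead of being threaded through the tuner separately.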
    
    * bug fix
    
    * fix
    
    * [Enhancement] Update convolution tests and profiling assertions
    
    - Added a random seed setting for reproducibility in convolution tests.
    - Removed several redundant convolution test cases to streamline the testing process.
    - Updated the assertion in the matrix multiplication profiling to accept a maximum mismatched ratio, which bounds the fraction of elements allowed to differ from the reference result.
    - Enabled the main testing function for better test execution.
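
The two testing ideas above, a fixed random seed and an assertion that tolerates a bounded fraction of mismatched elements, can be sketched as follows. This is a standard-library illustration, not the code in the test file: `mismatched_ratio`, `assert_close_with_ratio`, and the default tolerances are hypothetical, and the actual tests presumably seed and compare `torch` tensors instead of Python lists.

```python
import math
import random


def mismatched_ratio(actual, expected, rtol=1e-2, atol=1e-2):
    """Fraction of element pairs that are not close under rtol/atol."""
    if len(actual) != len(expected):
        raise ValueError("inputs must have the same length")
    if not actual:
        return 0.0
    # math.isclose uses a symmetric tolerance, unlike torch.allclose.
    mismatched = sum(
        1
        for a, e in zip(actual, expected)
        if not math.isclose(a, e, rel_tol=rtol, abs_tol=atol)
    )
    return mismatched / len(actual)


def assert_close_with_ratio(actual, expected, max_mismatched_ratio=0.01, rtol=1e-2, atol=1e-2):
    """Pass as long as at most max_mismatched_ratio of the elements differ."""
    ratio = mismatched_ratio(actual, expected, rtol=rtol, atol=atol)
    assert ratio <= max_mismatched_ratio, (
        f"mismatched ratio {ratio:.4f} exceeds limit {max_mismatched_ratio:.4f}"
    )


if __name__ == "__main__":
    random.seed(0)  # fixed seed so the generated inputs repeat across runs
    expected = [random.gauss(0.0, 1.0) for _ in range(1000)]
    actual = list(expected)
    actual[0] += 10.0  # a single outlier: 1 of 1000 elements mismatches
    assert_close_with_ratio(actual, expected, max_mismatched_ratio=0.01)
```

A ratio-based check of this kind keeps a test from failing on a handful of outlier elements while still catching results that are broadly wrong.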
    
    * lint fix
test_tilelang_kernel_convolution.py 5.23 KB