    [Refactor] Phase Out LLVM Dependency by Making it Optional (#247) · f2e99180
    Lei Wang authored
    * remove llvm build
    
    * [Refactor] Update kernel compilation and profiling in examples
    
    - Replaced `tilelang.lower` with `tilelang.compile` in multiple example scripts to streamline kernel compilation.
    - Updated profiling calls to utilize the new `get_profiler` method, enhancing performance measurement consistency.
    - Adjusted assertions and benchmarking methods to align with the new profiling structure across various examples, ensuring correctness and clarity in performance evaluations.
    
    * lint fix
    
    * License Update
    
    * [Refactor] Improve code formatting and documentation in CUDA header and HIP runtime files
    
    - Adjusted formatting in `cuda.h` for better readability, including alignment of comments and struct fields.
    - Cleaned up whitespace and improved comment clarity in `rt_mod_hip.cc` to enhance code maintainability.
    
    * [Refactor] Enhance formatting and clarity in CUDA header and HIP runtime files
    
    - Improved comment alignment and readability in `cuda.h`.
    - Cleaned up whitespace and formatting in `rt_mod_hip.cc` to enhance maintainability.
    
    * lint fix
    
    * lint fix
    
    * lint fix
    
    * lint fix
    
    * fix
    
    * License update
    
    * [Enhancement] Update JITKernel to use artifact for kernel source
    
    - Assigned the generated artifact to `self.artifact` for better management.
    - Updated kernel source references to use `artifact.kernel_source` for consistency in execution backend handling.
    
    * lint fix
    
    * Add @tilelang.testing.requires_llvm decorator to vectorization tests
    
    * Enhance setup.py and env.py for library management
    
    - Added functionality to remove original files after copying in CMakeBuild.
    - Updated TVM_LIBRARY_PATH in env.py to include the PyPI build library path for better integration.
    
    * Refactor TVM_LIBRARY_PATH assignment for improved readability in env.py
    
    * Refactor CMakeBuild file handling in setup.py
    
    - Added a check to ensure the target library directory exists before copying .so files.
    - Improved the logic for creating the target directory and copying files to enhance robustness.
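The directory check and copy step described above can be sketched as follows. This is a minimal illustration, not the actual `CMakeBuild` code in `setup.py`: the function name `copy_shared_libs`, its parameters, and the `remove_original` flag are hypothetical names chosen for the example.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def copy_shared_libs(build_dir: Path, target_dir: Path,
                     remove_original: bool = False) -> List[Path]:
    """Copy every .so file from build_dir into target_dir (hypothetical helper)."""
    # Create the target directory first so the copy cannot fail on a missing path.
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for lib in sorted(build_dir.glob("*.so")):
        dest = target_dir / lib.name
        shutil.copy2(lib, dest)
        if remove_original:
            # Optionally drop the source file once the copy has succeeded.
            lib.unlink()
        copied.append(dest)
    return copied
```

Creating the directory with `exist_ok=True` makes the step safe to run repeatedly, which matters when a build is re-invoked on a partially populated tree.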
    
    * bugfix
    
    * Rename `BuildTLDebug` to `BuildTileLangCUDAWithoutCompile` and update its registration. Add the `@tilelang.testing.requires_llvm` decorator to multiple tests that require LLVM.
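A decorator that skips tests when an optional dependency is missing could look roughly like the sketch below. It is not the implementation of `tilelang.testing.requires_llvm`; the helper names `requires_feature` and `llvm_available`, and the `llvm-config` lookup used as the probe, are assumptions made for illustration.

```python
import shutil
import unittest


def requires_feature(probe, name):
    """Build a decorator that skips a test unless probe() returns True.

    Hypothetical helper: the probe runs once, when the decorator is created.
    """
    return unittest.skipUnless(probe(), f"{name} is not available")


def llvm_available() -> bool:
    # Assumed probe: treat LLVM as present if llvm-config is on PATH.
    return shutil.which("llvm-config") is not None


# Hypothetical stand-in for a requires_llvm style decorator.
requires_llvm = requires_feature(llvm_available, "LLVM")


@requires_llvm
def test_vectorize_on_llvm():
    # Placeholder body; a real test would build and run an LLVM-targeted kernel.
    assert True
```

With this pattern the test suite still passes on machines built without LLVM, because the marked tests are reported as skipped instead of failing.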
    
    * lint fix
    
    * Enhance TileLang code generation by adding support for device code generation without compilation. Updated `host_codegen` and `device_codegen` functions to include new transformations and registration for `tilelang_hip_without_compile`. Refactored JIT kernel adapters to accommodate host and device modules, improving overall integration and flexibility.
    
    * lint fix
    
    * Add support for C target in device code generation
    
    - Updated `device_codegen_without_compile` to include handling for the C target by registering the `tilelang_cpp` function.
    
    * [Enhancement] Implement auto-clear cache feature based on environment variable
    
    * Added TILELANG_CLEAR_CACHE environment variable to control cache clearing.
    * Updated CI workflow to set TILELANG_CLEAR_CACHE during testing.
    * Modified cache initialization to clear cache if TILELANG_CLEAR_CACHE is set to true.
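The environment-variable check described above might be sketched like this. Only the variable name `TILELANG_CLEAR_CACHE` comes from the commit message; the function names, the `env` parameter, and the set of accepted truthy spellings beyond `true` are assumptions for the example.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Assumption: a few common spellings count as "true"; the commit only names "true".
TRUTHY = {"1", "true", "yes", "on"}


def should_clear_cache(env=None) -> bool:
    """Return True when TILELANG_CLEAR_CACHE is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("TILELANG_CLEAR_CACHE", "").strip().lower() in TRUTHY


def init_cache(cache_dir: Path, env=None) -> Path:
    """Prepare cache_dir, wiping existing contents first if clearing is requested."""
    if should_clear_cache(env) and cache_dir.exists():
        shutil.rmtree(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    return cache_dir
```

In CI, setting the variable before the test run means every job starts from an empty cache, so stale kernels from an earlier build cannot mask a regression.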
    
    * [Refactor] Update kernel invocation and import paths in tests and cache
    
    * Changed kernel invocation in `test_tilelang_kernel_dequantize_gemm.py` to return the result.
    * Updated import statements in `test_tilelang_kernel_int4_gemm_mma.py` to use `bitblas` instead of `tilelang`.
    * Refactored paths for artifact and parameters in `kernel_cache.py` for better maintainability.
    
    * [Refactor] Clean up whitespace and improve code formatting in kernel_cache.py
    
    * Removed unnecessary blank lines and adjusted spacing for better readability in the KernelCache class.
    * Enhanced overall code formatting to align with project standards.
    
    * [Enhancement] Add bfloat16 test case and improve kernel caching logic
    
    * Introduced a new test case for bfloat16 matrix multiplication in `test_tilelang_kernel_gemm_mma_intrinsic.py`.
    * Updated `KernelCache` to handle multiple kernel source files and improve error handling during saving and loading.
    * Refactored `JITKernel` to support instantiation from a database, enhancing flexibility in kernel management.
    * Adjusted `CtypesKernelAdapter` and `CythonKernelAdapter` to utilize the new kernel loading mechanism from the database.
    * Improved code formatting and readability across several files.
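Saving and loading several kernel source files with basic error handling can be sketched as below. This is not the `KernelCache` implementation: the function names and the file names used in the usage note are hypothetical, and the sketch only shows the general pattern of logging a failure and reporting it to the caller instead of raising.

```python
import logging
import tempfile
from pathlib import Path
from typing import Dict, Iterable, Optional

logger = logging.getLogger(__name__)


def save_kernel_sources(cache_dir: Path, sources: Dict[str, str]) -> bool:
    """Write each named source file into cache_dir; return False on I/O failure."""
    try:
        cache_dir.mkdir(parents=True, exist_ok=True)
        for name, code in sources.items():
            (cache_dir / name).write_text(code)
        return True
    except OSError as exc:
        logger.warning("Could not save kernel sources to %s: %s", cache_dir, exc)
        return False


def load_kernel_sources(cache_dir: Path,
                        names: Iterable[str]) -> Optional[Dict[str, str]]:
    """Read the named source files; return None if any of them cannot be read."""
    try:
        return {name: (cache_dir / name).read_text() for name in names}
    except OSError as exc:
        logger.warning("Could not load kernel sources from %s: %s", cache_dir, exc)
        return None
```

A caller could, for example, store a device source and a host wrapper under two hypothetical names such as `kernel.cu` and `wrapped_kernel.cu`, and fall back to recompiling when `load_kernel_sources` returns `None`.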
    
    * lint fix
    
    * Update bfloat16 matrix multiplication test case to use larger dimensions for improved coverage
runtime.h 490 Bytes