    [JIT] Support Cython JIT and make Cython a default execution backend (#102) · 3471904f
    Lei Wang authored
    * [Feature] Add CTypes JIT kernel support for dynamic shapes and multi-stream execution
    
    - Enhance CtypesKernelAdapter to handle dynamic symbolic shapes
    - Add support for multi-stream kernel execution in CTypes backend
    - Implement dynamic shape handling in test_tilelang_jit_gemm_ctypes.py
    - Add symbolic shape utility function in tilelang.language
    - Update profiler to improve flexibility in benchmark selection
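
A minimal sketch of what handling dynamic symbolic shapes can look like at call time: each symbolic dimension is bound to the concrete size seen in the runtime tensor shapes, and static dimensions are checked. The helper name `bind_symbolic_dims` and the use of strings as symbols are assumptions for illustration only; the adapter's actual logic is not shown in this commit message.

```python
from typing import Dict, Sequence, Tuple, Union

Dim = Union[int, str]  # assumption: a str stands in for a symbolic dimension


def bind_symbolic_dims(
    declared: Sequence[Tuple[Dim, ...]],
    actual: Sequence[Tuple[int, ...]],
) -> Dict[str, int]:
    """Resolve symbolic dims (hypothetical helper, not the tilelang API)."""
    if len(declared) != len(actual):
        raise ValueError("number of tensors does not match the kernel signature")
    bindings: Dict[str, int] = {}
    for decl_shape, real_shape in zip(declared, actual):
        if len(decl_shape) != len(real_shape):
            raise ValueError(f"rank mismatch: {decl_shape} vs {real_shape}")
        for decl, real in zip(decl_shape, real_shape):
            if isinstance(decl, str):
                # The first occurrence binds the symbol; later ones must agree.
                if bindings.setdefault(decl, real) != real:
                    raise ValueError(f"conflicting sizes for symbol {decl!r}")
            elif decl != real:
                raise ValueError(f"expected static dim {decl}, got {real}")
    return bindings


# GEMM-like signature: A is (m, 1024), B is (1024, n).
sizes = bind_symbolic_dims([("m", 1024), (1024, "n")], [(256, 1024), (1024, 512)])
```

The resolved sizes would then be passed to the compiled kernel alongside the tensor pointers.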
    
    * Remove redundant thread binding in GEMM kernel implementations
    
    - Remove unnecessary `thread_binding` line in GEMM kernel functions
    - Clean up code in `examples/gemm/README.md` and `testing/python/kernel/test_tilelang_kernel_int4_gemm_mma.py`
    - Enhance code readability by removing redundant thread binding annotation
    
    * Fix indentation in int4 GEMM kernel test file
    
    - Correct indentation for function calls in `test_tilelang_kernel_int4_gemm_mma.py`
    - Remove extra indentation in `mma_emitter.ldmatrix_a()` and `mma_emitter.ldmatrix_b()` calls
    - Improve code formatting for better readability
    
    * [Feature] Add Cython JIT kernel support for dynamic shapes and multi-stream execution
    
    - Implement CythonKernelAdapter to handle dynamic symbolic shapes
    - Add support for multi-stream kernel execution in Cython backend
    - Create comprehensive test suite for Cython GEMM kernel in test_tilelang_jit_gemm_cython.py
    - Update JITKernel to include "cython" as a valid execution backend
    - Add Cython-specific wrapper and library generation modules
    - Update .gitignore to exclude Cython cache directory
    - Modify setup.py to include Cython source files in package data
    
    * lint fix
    
    * [Refactor] Replace JITKernel with compile() function for kernel compilation
    
    - Add new `compile()` function in tilelang/jit/__init__.py as a wrapper for JITKernel
    - Update multiple test files and examples to use `tilelang.compile()` instead of `tilelang.JITKernel()`
    - Modify kernel adapters to support optional kernel-only source retrieval
    - Update `__init__.py` to import the new `compile()` function
    - Improve kernel source retrieval for different execution backends
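
The wrapper pattern described above can be sketched as follows. This is a stand-in under stated assumptions, not the code in `tilelang/jit/__init__.py`: the `JITKernel` class here is a toy, the function is named `compile_kernel` to avoid shadowing Python's built-in `compile`, and the backend list is an assumption based on the backends named in this commit message.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Assumption: only the two backends mentioned in this commit message.
VALID_BACKENDS = ("ctypes", "cython")


@dataclass
class JITKernel:
    """Toy stand-in for the real JITKernel class."""

    func: Callable[..., Any]
    execution_backend: str = "cython"

    def __post_init__(self) -> None:
        if self.execution_backend not in VALID_BACKENDS:
            raise ValueError(
                f"unknown execution backend {self.execution_backend!r}; "
                f"expected one of {VALID_BACKENDS}"
            )

    def __call__(self, *args: Any) -> Any:
        # A real kernel would dispatch to compiled code here.
        return self.func(*args)


def compile_kernel(func: Callable[..., Any], execution_backend: str = "cython") -> JITKernel:
    """Thin functional wrapper so callers do not construct JITKernel directly."""
    return JITKernel(func, execution_backend=execution_backend)


kernel = compile_kernel(lambda a, b: a + b)
```

Keeping the class behind a function lets defaults such as the backend change in one place without touching call sites.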
    
    * lint fix
    
    * remove debug print
    
    * Add C/C++ compiler utility module and update Cython JIT kernel support
    
    - Introduce new `tilelang/contrib/cc.py` module with cross-platform C/C++ compiler utilities
    - Add functions to detect and retrieve system C/C++ compilers
    - Implement cross-compilation and shared library creation support
    - Update Cython JIT kernel to validate C++ compiler availability
    - Modify Cython adapter to use detected C++ compiler for library generation
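
As a rough illustration of compiler detection, the sketch below honours a `CXX` environment override and then falls back to common compiler names on `PATH`. The function name `find_cxx_compiler` and the candidate list are hypothetical; the helpers in `tilelang/contrib/cc.py` may be named and behave differently.

```python
import os
import shutil
from typing import Mapping, Optional, Sequence


def find_cxx_compiler(
    candidates: Sequence[str] = ("g++", "clang++", "c++"),
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the path of a usable C++ compiler, or None if none is found."""
    env = os.environ if env is None else env
    override = env.get("CXX", "").strip()
    if override:
        # CXX may carry flags ("g++ -O2"); only the first token is the binary.
        resolved = shutil.which(override.split()[0])
        if resolved:
            return resolved
    for name in candidates:
        resolved = shutil.which(name)
        if resolved:
            return resolved
    return None


cxx = find_cxx_compiler()
```

A caller that needs a compiler can check for `None` up front and fail with a clear message before attempting library generation.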
    
    * Refactor float8 dtype mapping in tensor utility module
    
    - Move float8_dtype_map inside adapt_torch2tvm function
    - Simplify global scope by localizing the dtype mapping
    - Maintain existing functionality for converting torch float8 tensors to TVM ndarray
    
    * revert
    
    * Enhance Cython JIT adapter with Cython compiler detection
    
    - Add `get_cython_compiler()` function to dynamically locate Cython executable
    - Update Cython adapter to use detected Cython compiler instead of hardcoded command
    - Raise an exception if no Cython compiler is found
    - Update requirements.txt to specify minimum PyTorch version (>=2.2.0)
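
One possible shape for the lookup described above, assuming the Cython executable lives either next to the running interpreter or on `PATH`. The name `locate_cython`, the candidate names, and the error message are illustrative assumptions and are not taken from `get_cython_compiler()` itself.

```python
import os
import shutil
import sys
from typing import Optional, Sequence


def locate_cython(
    names: Sequence[str] = ("cython", "cython3"),
    extra_dirs: Optional[Sequence[str]] = None,
) -> str:
    """Find a Cython executable or raise (hypothetical helper)."""
    if extra_dirs is None:
        # Virtualenvs usually place console scripts beside the interpreter.
        extra_dirs = [os.path.dirname(sys.executable)]
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    for name in names:
        found = shutil.which(name, path=search_path)
        if found:
            return found
    raise RuntimeError("Cython compiler not found; try `pip install cython`")
```

Raising instead of returning `None` mirrors the commit's stated behavior of failing when no Cython compiler is available.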
    
    * Fix Cython kernel wrapper stream handling and type annotations
    
    - Update stream parameter type to int64_t for better compatibility
    - Directly use torch.cuda.current_stream().cuda_stream instead of casting
    - Improve type safety and precision in Cython kernel wrapper
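
A small illustration of why a 64-bit integer type suits a stream parameter: `torch.cuda.current_stream().cuda_stream` is an integer holding a pointer-sized handle, which a 32-bit integer cannot represent in general. The sketch uses `ctypes` and a made-up handle value so it runs without a GPU; the real wrapper is written in Cython, and `as_stream_arg` is a hypothetical helper.

```python
import ctypes


def as_stream_arg(handle: int) -> ctypes.c_int64:
    """Wrap a raw stream handle as a 64-bit integer argument."""
    if not -(2**63) <= handle < 2**63:
        raise OverflowError("stream handle does not fit in int64_t")
    return ctypes.c_int64(handle)


# Made-up pointer-like value; with PyTorch this would come from
# torch.cuda.current_stream().cuda_stream.
fake_stream = 0x7F12_3456_7000

wide = as_stream_arg(fake_stream).value
narrow = ctypes.c_int32(fake_stream).value  # ctypes silently truncates
```

Here `wide` round-trips the handle while `narrow` keeps only its low 32 bits.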