"ts/vscode:/vscode.git/clone" did not exist on "3a1416cb36bf2402b03ef983ff53ffb0ac732ed9"
-
Lei Wang authored
* [Enhancement] Refactor GEMM operations for improved warp partitioning and target instruction handling
  - Introduced a new `GetGemmInst` method to determine the appropriate GEMM instruction based on block size and target architecture.
  - Updated `ComputeWarpPartition` to accept the GEMM instruction type, enhancing flexibility in warp partitioning logic.
  - Added `TargetGetWarpSize` utility to streamline warp size retrieval based on target architecture.
  - Refactored layout inference and lowering methods to utilize the new GEMM instruction handling, improving clarity and maintainability of the codebase.
* bug fix
* test fix
* lint fix
* phase out Canonialize
* add option --expt-relaxed-constexpr
* [Enhancement] Introduce tilelang intrinsic operations for GEMM
  - Added `tl_gemm` and `tl_gemm_sp` built-in operations to support general and sparse matrix multiplication in tilelang.
  - Updated the lowering logic in `Gemm` and `GemmSP` to utilize the new tilelang operations.
  - Enhanced CUDA and HIP code generation to handle the new GEMM operations, ensuring proper argument validation and external call printing.
  - Implemented shared memory alignment planning for GEMM operations to optimize performance on supported architectures.
* lint fix
* lint fix
* test fix
* test fix
* rebase
* Update builtin.cc
17fafc1b