"vscode:/vscode.git/clone" did not exist on "cc95fef863380dc850dbbd407d5eb3fb024809ea"
  1. 31 Oct, 2025 1 commit
    • Lei Wang's avatar
      [FFI] Rebase tvm to v0.22.0 to utilize tvm-ffi (#1108) · 10911e28
      Lei Wang authored
      
      
      * 3rdparty tvm bump
      
      * bump tvm into v0.22.0
      
      * lint fix
      
      * rebase tvm
      
      * Update submodule tvm to latest commit 3085bc4
      
      * Refactor: Update configuration retrieval in CopyNode and adjust test registration in tilelang
      
      * test fix
      
      * add requirement
      
      * atomic_fix
      
      * atomic_fix
      
      * phaseout py39
      
      * optimize
      
      * optimize
      
      * lint fix
      
      * do not clean cache
      
      * do not clean cache
      
      * [Minor] Minor update for Python versions and dependencies
      
      * [Lint] fix lint for py39
      
      * [Lint] fix lint for ROCm
      
      * [Build][CI] Sync CI changes from upstream/sdist
      
      * [Lint] fix lint for ROCm
      
      * [Build][CI] Update `repair-wheel-command`
      
      * [Minor] update abi3audit result format
      
      * [Lint] fix lint for ROCm
      
      * [BugFix] fix build
      
      * [Lint] fix lint for ROCm
      
      * [BugFix] set rpath for libtvm and libtvm_runtime
      
      * [Deps] pin apache-tvm-ffi version
      
      * [Build] set Python 3.9 Limited API for Cython target
      
      * [Build] set Python 3.9 Limited API for Cython target
      
      * [Deps] Restore Python 3.8 support
      
      * [Build] use `apache-tvm-ffi`'s `libtvm_ffi`
      
      * [BugFix] use `;` as delimiter for RPATH on macOS
      
      * [BugFix] use `--ignore-missing-dependencies` for `delocate-wheel`
      
      * [Build] support `sccache` if available
      
      * [Build] add CIBW import test
      
      * [Build][CI] enable ccache for CIBW on Linux
      
      * [BugFix] set rpath for libtvm and libtvm_runtime
      
      * Revert "[Build][CI] enable ccache for CIBW on Linux"
      
      This reverts commit cd9ab57bb5ddd2572c60bcbbebde81480a658fd3.
      
      * [CI] fix perfbench bot
      
      * [BugFix] use Python 3.9 to build wheel
      
      * [Minor] update perfbench bot envs
      
      * [BugFix] fix CIBW environment on Linux
      
      * [CI] skip import test on CentOS 7
      
      * [CI] use Python urllib to download file instead of Wget
      
      ---------
      Co-authored-by: default avatarXuehai Pan <XuehaiPan@pku.edu.cn>
      10911e28
  2. 10 Sep, 2025 1 commit
    • Lei Wang's avatar
      [TileOp] Introduce a experimental python defined `T.gemm_v2` (#793) · 91a7bb2b
      Lei Wang authored
      * Refactor GEMM and GEMM-SP operations to enhance clarity and maintainability
      
      - Removed deprecated prime factorization functions from `gemm.cc` and `gemm_sp.cc`.
      - Introduced a new `GemmWarpPolicy` class to manage warp policy attributes and methods, improving encapsulation.
      - Updated reflection methods to include the new policy structure, ensuring proper registration and introspection capabilities.
      - Enhanced `GetArchInt` function in `utils.cc` for better readability and type safety.
      - Added new `gemm_v2` function in `gemm.py` for improved GEMM operation with additional parameters and checks.
      
      * Refactor GEMM and frontend legalize operations for improved clarity and functionality
      
      - Updated `gemm_py.h` to include the correct header for GEMM operations.
      - Renamed `FrontendLegalizer` class to `LetInliner` and updated related methods to reflect this change, enhancing code clarity.
      - Modified the pass function from `FrontendLegalize` to `LetInline` for better alignment with its purpose.
      - Updated test cases to utilize the new `gemm_v2` function and adjusted the testing framework for improved output and clarity.
      - Removed obsolete test file `test_tilelang_transform_frontend_legalize.py` to streamline the test suite.
      - Enhanced the `LowerAndLegalize` function to utilize the new `LetInline` pass, improving the overall transformation process.
      
      * Enhance CUDA code generation and testing for GEMM operations
      
      - Added indentation printing in `codegen_cuda.cc` for improved assembly code formatting.
      - Updated `test_tilelang_tilelibrary_gemm.py` to include additional GEMM test cases and shared memory allocation with specified scope.
      - Introduced new `matmul_sr` and `run_gemm_sr` functions for GEMM operations with shared and fragment memory layouts.
      - Refactored layout inference in `mma_macro_generator.py` to improve clarity and correctness in shared memory handling.
      - Enhanced `gemm/__init__.py` to support new GEMM operation combinations and layout inference logic.
      
      These changes improve the clarity, functionality, and testing coverage of GEMM operations in the TileLang framework.
      
      * Refactor GEMM layout and testing for improved clarity and functionality
      
      - Updated `gemm_layouts.cc` to enhance the layout generation logic for transposed and non-transposed GEMM operations.
      - Renamed and modified functions in `test_tilelang_tilelibrary_gemm.py` to reflect changes in GEMM function signatures and improve test coverage.
      - Introduced new GEMM operation combinations in `gemm/__init__.py` to support additional layouts and configurations.
      - Enhanced layout inference in `mma_layout.py` and `mma_macro_generator.py` for better handling of shared memory layouts.
      
      These changes improve the clarity, functionality, and testing coverage of GEMM operations in the TileLang framework.
      
      * Refactor GEMM layout and Python integration for improved functionality
      
      - Updated `gemm_layouts.cc` to correct the order of layout replication and repetition for transposed and non-transposed GEMM operations.
      - Enhanced `gemm_py.cc` to handle block realization more robustly, ensuring correct assignment of global symbols and block attributes.
      - Refactored `inject_pipeline.cc` to streamline buffer read/write region handling, improving clarity and maintainability.
      - Cleaned up test cases in `test_tilelang_tilelibrary_gemm.py` by removing unnecessary print statements and adjusting function calls for better test execution flow.
      
      These changes enhance the clarity, functionality, and robustness of GEMM operations and their testing in the TileLang framework.
      
      * Refactor GEMM layout and testing for improved clarity and functionality
      
      - Updated `gemm_layouts.cc` to enhance layout generation logic for transposed and non-transposed GEMM operations.
      - Improved block realization handling in `gemm_py.cc` for better assignment of global symbols.
      - Streamlined buffer read/write region handling in `inject_pipeline.cc` for clarity.
      - Enhanced test cases in `test_tilelang_tilelibrary_gemm.py` by adjusting function calls and adding new GEMM operation combinations.
      
      These changes improve the clarity, functionality, and robustness of GEMM operations and their testing in the TileLang framework.
      
      * tfloat32 support.
      
      * lint fix
      
      * lint fix
      
      * Refactor shared memory allocation in GEMM tests
      
      - Removed unnecessary scope specification in shared memory allocation for matrices A and B in `test_tilelang_tilelibrary_gemm.py`.
      - This change simplifies the allocation process and aligns with the updated GEMM function signatures.
      91a7bb2b
  3. 04 Sep, 2025 1 commit
    • Lei Wang's avatar
      [Refactor] Support python reflection for tile operators (#783) · 3cfefc8e
      Lei Wang authored
      * Implement Fill operator and related reflection methods in TileLang
      
      - Added Fill operator implementation in `fill.cc` and `fill.h` for element-wise filling of buffers.
      - Introduced reflection methods for Fill, AtomicAdd, Copy, Conv2DIm2Col, FinalizeReducer, Gemm, and Parallel operators to enhance introspection capabilities.
      - Updated relevant files to register reflection methods and ensure proper initialization in static blocks.
      - Removed outdated comments and unnecessary code in various operator files to improve clarity and maintainability.
      - Added new Python bindings for the Fill operator in `tilelang/ir/fill.py` and updated the module imports accordingly.
      
      * Refactor operator reflection methods and improve code clarity
      
      - Updated reflection methods for AtomicAdd, Copy, FinalizeReducer, Gemm, and Parallel operators to enhance readability by using `empty()` instead of size checks.
      - Consolidated static initialization blocks for various operators to a single line for improved consistency.
      - Cleaned up whitespace and formatting in multiple files to adhere to coding standards and improve maintainability.
      - Added new Python bindings for operators in the `tilelang/ir` module, ensuring proper registration and organization of imports.
      
      * Refactor GEMM and AtomicAdd operations for improved clarity
      
      - Updated the `GetArchInt` function in `atomic_add.cc` to use `std::string` and `std::stoi` for better readability and type safety.
      - Removed unnecessary variables and comments in `gemm_sp.cc` and `gemm.cc` to streamline the `ComputeWarpPartition` method.
      - Cleaned up the `layout_reducer.cc` file by removing unused variable declarations, enhancing code clarity.
      - Added import for the `ir` module in `tilelang/__init__.py` to ensure proper organization of module imports.
      
      * Remove deprecated operator files from the tilelang IR module
      
      - Deleted files for Fill, AtomicAdd, Copy, Gemm, GemmSP, FinalizeReducer, Parallel, Reduce, and Region operators to streamline the codebase.
      - This cleanup enhances maintainability by removing unused code and improving overall organization of the module.
      
      * Refactor imports in tilelang IR module for improved organization
      
      - Updated import statements in `tilelang/ir.py` to reflect changes in the TVM library structure, enhancing clarity and maintainability of the codebase.
      
      * lint fix
      
      * Refactor GEMM and GEMM-SP operations to enhance clarity and maintainability
      
      - Updated the `Gemm` and `GemmSP` classes to utilize a new `GemmWarpPolicy` object for warp partitioning, improving encapsulation and readability.
      - Removed deprecated `ComputeWarpPartition` methods and replaced them with calls to the new policy object, streamlining the code.
      - Cleaned up comments and unnecessary code in `gemm.cc`, `gemm_sp.cc`, and related header files to enhance overall clarity.
      - Introduced a new `GemmWarpPolicyNode` class to manage warp policy attributes and methods, facilitating better organization of related functionalities.
      - Updated reflection methods to include the new policy structure, ensuring proper registration and introspection capabilities.
      
      * Refactor Reduce operation to utilize ReduceType class for improved clarity and maintainability
      
      - Replaced multiple conditional checks for reduce types with a single ReduceType object, simplifying the code structure.
      - Introduced a new ReduceTypeNode class to encapsulate reduce type logic and methods, enhancing organization.
      - Updated MakeInitValue, MakeReduce, and Lower methods to leverage the new ReduceType class, improving readability.
      - Added Python bindings for the ReduceType class in tilelang IR module to ensure proper registration and usability.
      
      * comment
      
      * Refactor operator header files for improved readability
      
      - Cleaned up formatting and whitespace in `atomic_add.h`, `copy.h`, `fill.h`, `reduce.cc`, and `reduce.h` to enhance code clarity.
      - Consolidated comments and adjusted line breaks for better organization and maintainability across multiple operator definitions.
      
      * Refactor MakeReduce method in ReduceOpNode for clarity
      
      - Updated the parameter name in the MakeReduce method from `rhs` to `b` and assigned it to `rhs` for improved readability.
      - This change enhances the clarity of the method's purpose and aligns with the overall refactoring efforts in the Reduce operation.
      
      * Update Reduce operation type checks for consistency
      
      - Changed string comparisons for reduce types in the MakeReduce method from "abs_sum" to "abssum" and "abs_max" to "absmax" for uniformity.
      - This adjustment enhances the clarity and consistency of the reduce type handling in the codebase.
      3cfefc8e