1. 13 Oct, 2025 1 commit
    • Yichen Yan's avatar
      [Build] Migrate to scikit-build-core (#939) · d89ba5b8
      Yichen Yan authored
      
      
      * cleanup
      
      * init
      
      * build first wheel that may not work
      
      * build cython ext
      
      * fix tvm build
      
      * use sabi
      
      * update rpath to support auditwheel
      
      * pass editible build
      
      * update ci
      
      * fix warnings
      
      * do not use ccache in self host runner
      
      * test local uv cache
      
      * test pip index
      
      * update lib search to respect new lib location
      
      * fix
      
      * update ci
      
      * enable cuda by default
      
      * update src map
      
      * fix
      
      * fix
      
      * fix
      
      * Generate version with backend and git information at build time
      
      * copy tvm_cython to wheels
      
      * fix tvm lib search
      
      * fmt
      
      * remove unused
      
      * auto detect ccache
      
      * add back backend-related files
      
      * remove jit cython adaptor to simplify code
      
      * fmt
      
      * fix ci
      
      * ci fix 2
      
      * ci fix 3
      
      * workaround metal
      
      * ci fix 4
      
      * fmt
      
      * fmt
      
      * Revert "ci fix 4"
      
      This reverts commit d1de8291c3e40927955f3ad3cf87a75c78813676.
      
      * tmp
      
      * fix metal
      
      * trivial cleanup
      
      * add detailed build-time version for cuda
      
      * add back mlc
      
      * Restore wheel info and other trivial updates
      
      * update
      
      * fix cuda
      
      * upd
      
      * fix metal ci
      
      * test for ga build
      
      * test for nvidia/cuda
      
      * test ubuntu 20
      
      * fix
      
      * fix
      
      * Do not use `uv build`
      
      * fix
      
      * fix
      
      * log toolchain version
      
      * merge wheel
      
      * update
      
      * debug
      
      * fix
      
      * update
      
      * skip rocm
      
      * update artifacts each
      
      * fix
      
      * fix
      
      * add mac
      
      * fix cache
      
      * fix cache
      
      * fix cache
      
      * reset and add comment
      
      * upd
      
      * fix git version
      
      * update deps
      
      * trivial update
      
      * use in-tree build dir and install to src to speedup editable build
      
      * Revert "use in-tree build dir and install to src to speedup editable build"
      
      This reverts commit 6ab87b05c5eed811210136b8dca4fc3677dd51f2.
      
      * add build-dir
      
      * update docs
      
      * remove old scrips
      
      * [1/n] cleanup scripts
      
      * [Lint]: [pre-commit.ci] auto fixes [...]
      
      * fix and update
      
      * wait for tvm fix
      
      * revert some tmp fix
      
      * fix
      
      * fix
      
      * spell
      
      * doc update
      
      * test cibuildwheel
      
      * fix and test macos on ci
      
      * Update .github/workflows/dist.yml
      Co-authored-by: default avatarXuehai Pan <XuehaiPan@outlook.com>
      
      * fix
      
      * test ga event
      
      * cleanup
      
      * bump tvm to support api3
      
      * test final version
      
      * add cron
      
      * Update .github/workflows/dist.yml
      Co-authored-by: default avatarXuehai Pan <XuehaiPan@outlook.com>
      
      * fix
      
      * test ccache for metal cibuildwheel
      
      * test newer macos
      
      * finish
      
      ---------
      Co-authored-by: default avatarpre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
      Co-authored-by: default avatarXuehai Pan <XuehaiPan@outlook.com>
      d89ba5b8
  2. 11 Oct, 2025 1 commit
  3. 09 Oct, 2025 4 commits
    • Lei Wang's avatar
      [TileOp] Implement WGMMA for T.gemm_v2 (#813) · a13cde28
      Lei Wang authored
      * [Feature] Introduce WGMMA support and enhance GEMM layout handling
      
      - Added support for the WGMMA intrinsic in the TileLang framework, enabling efficient matrix multiplication on newer architectures.
      - Refactored GEMM layout functions to accept a boolean parameter for K dimension handling, improving flexibility in layout generation.
      - Updated layout inference logic to accommodate new WGMMA configurations and ensure compatibility with existing GEMM operations.
      - Enhanced Python bindings for layout functions, allowing for better integration and usability in user-defined operations.
      - Improved documentation for layout functions and GEMM operations to clarify usage and parameters.
      
      These changes enhance the performance and usability of GEMM operations, particularly for advanced architectures, while maintaining backward compatibility with existing implementations.
      
      * [Refactor] Clean up code formatting and enhance layout function readability
      
      - Improved code formatting across multiple files for better readability, including consistent indentation and line breaks.
      - Updated layout function signatures to enhance clarity, particularly in `gemm_layouts.cc`, `layout.cc`, and `layout.h`.
      - Refactored lambda functions in `builtin.cc` and `gemm_py.cc` for improved structure and maintainability.
      - Enhanced comments and documentation in layout-related files to clarify usage and parameters.
      
      These changes contribute to a cleaner codebase and improved maintainability of layout functions in the TileLang framework.
      
      * [Feature] Add descriptor initialization and offset manipulation for WGMMA
      
      - Introduced new TileLang builtins `initialize_descriptor` and `increase_descriptor_offset` to facilitate descriptor management for WGMMA operations.
      - Updated `builtin.cc` and `builtin.h` to define and document the new builtins, enhancing the framework's capabilities for descriptor handling.
      - Modified `codegen_cuda.cc` and `ptx.cc` to integrate the new builtins into the code generation process, ensuring proper assembly generation for WGMMA operations.
      - Enhanced the `GemmWGMMA` class to utilize the new descriptor functionalities, improving the efficiency of matrix multiplication operations.
      - Updated related tests and documentation to reflect the new features and ensure comprehensive coverage.
      
      These changes enhance the TileLang framework's support for advanced matrix operations on newer architectures, improving performance and usability.
      
      * [Refactor] Improve code formatting and readability in various files
      
      - Enhanced code formatting across multiple files for better readability, including consistent indentation and line breaks.
      - Updated function signatures and comments in `builtin.h`, `codegen_cuda.cc`, and `ptx.cc` to improve clarity.
      - Refactored descriptor initialization and offset manipulation functions in `builtin.py` and `wgmma_macro_generator.py` for improved structure.
      - Cleaned up unnecessary whitespace and improved alignment in `common.h` and `allocate.py`.
      
      These changes contribute to a cleaner and more maintainable codebase in the TileLang framework.
      
      * [Update] Update subproject commit and refactor layout function call
      
      - Updated the subproject commit for `cutlass` to indicate a dirty state.
      - Refactored the `UpdateAnalyzer` function in `layout.cc` to call `LayoutNode::getVarMap()` instead of `getVarMap()`, improving clarity and ensuring proper context for variable mapping.
      
      These changes enhance the maintainability and clarity of the layout handling in the TileLang framework.
      
      * support more data types
      
      * gemm_rs support
      
      * lint fix
      
      * wgmma wrapper
      
      * Remove debug logging for wgmma assembly code and refactor swizzle byte size calculations in wgmma macro generator. Enhanced handling of leading and stride byte offsets based on swizzle mode, improving clarity and performance in tensor core intrinsic emissions.
      
      * Refactor GEMM layout functions to replace 'kfactor' with 'k_inner' for improved clarity and consistency. Update includes necessary changes in error messages for Hopper and Sm100 layouts. Additionally, include a new header for CUTE utilities in common.h.
      
      * Comprehensively support WGMMA GEMM SS
      
      * remove debug print
      
      * lint fix
      
      * remove debug print
      
      * reduce bwd test shape
      
      * lint fix
      
      * clear cache for pytest
      
      * lint fix
      
      * Update sparse MLA examples to support SKV adjustment and correctness checks
      
      - Changed SKV parameter from 32768 to 8192 in sparse MLA backward and forward tests.
      - Added check_correctness parameter to test functions for validation of outputs.
      - Updated test cases to reflect new SKV values and correctness checks.
      
      * test fix
      
      * adjust test case
      
      * test fix
      
      * skip some test currently
      a13cde28
    • dependabot[bot]'s avatar
      [CI]: Bump actions/checkout from 2 to 5 (#953) · 10adb79f
      dependabot[bot] authored
      Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 5.
      - [Release notes](https://github.com/actions/checkout/releases)
      - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
      - [Commits](https://github.com/actions/checkout/compare/v2...v5
      
      )
      
      ---
      updated-dependencies:
      - dependency-name: actions/checkout
        dependency-version: '5'
        dependency-type: direct:production
        update-type: version-update:semver-major
      ...
      Signed-off-by: default avatardependabot[bot] <support@github.com>
      Co-authored-by: default avatardependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
      Co-authored-by: default avatarLei Wang <34334180+LeiWang1999@users.noreply.github.com>
      10adb79f
    • dependabot[bot]'s avatar
      [CI]: Bump astral-sh/setup-uv from 6 to 7 (#952) · b6f90d25
      dependabot[bot] authored
      Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 6 to 7.
      - [Release notes](https://github.com/astral-sh/setup-uv/releases)
      - [Commits](https://github.com/astral-sh/setup-uv/compare/v6...v7
      
      )
      
      ---
      updated-dependencies:
      - dependency-name: astral-sh/setup-uv
        dependency-version: '7'
        dependency-type: direct:production
        update-type: version-update:semver-major
      ...
      Signed-off-by: default avatardependabot[bot] <support@github.com>
      Co-authored-by: default avatardependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
      b6f90d25
    • Xuehai Pan's avatar
  4. 07 Oct, 2025 1 commit