  1. 14 Oct, 2025 4 commits
  2. 13 Oct, 2025 4 commits
    • [CI] Removes redundant environment variable (#1020) · eb37e459
      Cunxiao Ni authored
      * [CI] Removes redundant environment variable
      Removes the `UV_INDEX_URL` environment variable.

      * trigger CI

      * trigger CI

      * trigger CI

      * trigger CI
    • [Build] Migrate to scikit-build-core (#939) · d89ba5b8
      Yichen Yan authored
      
      
      * cleanup
      
      * init
      
      * build first wheel that may not work
      
      * build cython ext
      
      * fix tvm build
      
      * use sabi
      
      * update rpath to support auditwheel
      
      * pass editable build
      
      * update ci
      
      * fix warnings
      
      * do not use ccache in self host runner
      
      * test local uv cache
      
      * test pip index
      
      * update lib search to respect new lib location
      
      * fix
      
      * update ci
      
      * enable cuda by default
      
      * update src map
      
      * fix
      
      * fix
      
      * fix
      
      * Generate version with backend and git information at build time
      
      * copy tvm_cython to wheels
      
      * fix tvm lib search
      
      * fmt
      
      * remove unused
      
      * auto detect ccache
      
      * add back backend-related files
      
      * remove jit cython adaptor to simplify code
      
      * fmt
      
      * fix ci
      
      * ci fix 2
      
      * ci fix 3
      
      * workaround metal
      
      * ci fix 4
      
      * fmt
      
      * fmt
      
      * Revert "ci fix 4"
      
      This reverts commit d1de8291c3e40927955f3ad3cf87a75c78813676.
      
      * tmp
      
      * fix metal
      
      * trivial cleanup
      
      * add detailed build-time version for cuda
      
      * add back mlc
      
      * Restore wheel info and other trivial updates
      
      * update
      
      * fix cuda
      
      * upd
      
      * fix metal ci
      
      * test for ga build
      
      * test for nvidia/cuda
      
      * test ubuntu 20
      
      * fix
      
      * fix
      
      * Do not use `uv build`
      
      * fix
      
      * fix
      
      * log toolchain version
      
      * merge wheel
      
      * update
      
      * debug
      
      * fix
      
      * update
      
      * skip rocm
      
      * update artifacts each
      
      * fix
      
      * fix
      
      * add mac
      
      * fix cache
      
      * fix cache
      
      * fix cache
      
      * reset and add comment
      
      * upd
      
      * fix git version
      
      * update deps
      
      * trivial update
      
      * use in-tree build dir and install to src to speedup editable build
      
      * Revert "use in-tree build dir and install to src to speedup editable build"
      
      This reverts commit 6ab87b05c5eed811210136b8dca4fc3677dd51f2.
      
      * add build-dir
      
      * update docs
      
      * remove old scripts
      
      * [1/n] cleanup scripts
      
      * [Lint]: [pre-commit.ci] auto fixes [...]
      
      * fix and update
      
      * wait for tvm fix
      
      * revert some tmp fix
      
      * fix
      
      * fix
      
      * spell
      
      * doc update
      
      * test cibuildwheel
      
      * fix and test macos on ci
      
      * Update .github/workflows/dist.yml
      Co-authored-by: Xuehai Pan <XuehaiPan@outlook.com>
      
      * fix
      
      * test ga event
      
      * cleanup
      
      * bump tvm to support api3
      
      * test final version
      
      * add cron
      
      * Update .github/workflows/dist.yml
      Co-authored-by: Xuehai Pan <XuehaiPan@outlook.com>
      
      * fix
      
      * test ccache for metal cibuildwheel
      
      * test newer macos
      
      * finish
      
      ---------
      Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
      Co-authored-by: Xuehai Pan <XuehaiPan@outlook.com>
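One step in the commit above generates the version "with backend and git information at build time". The log does not show the resulting format, so the following is only a minimal sketch of one way to do it, assuming the extra information is attached as a PEP 440 local version segment. The function name `compose_version`, the base version `0.1.0`, and the `git` prefix are all hypothetical.

```python
import re
from typing import Optional


def compose_version(base: str,
                    backend: Optional[str] = None,
                    git_hash: Optional[str] = None) -> str:
    """Append backend and git info to `base` as a PEP 440 local segment.

    Hypothetical helper: the project's real version format is not shown
    in the commit log.
    """
    parts = []
    if backend:
        # Keep only ASCII letters and digits, separated by single dots.
        parts.append(re.sub(r"[^0-9a-zA-Z]+", ".", backend).strip(".").lower())
    if git_hash:
        # Short hash, prefixed so the segment reads as a git revision.
        parts.append("git" + git_hash[:8].lower())
    if not parts:
        return base
    return base + "+" + ".".join(parts)


print(compose_version("0.1.0", backend="cuda", git_hash="d89ba5b8"))
# prints 0.1.0+cuda.gitd89ba5b8
```

A local segment keeps the public version comparable across backends while still recording which build produced a given wheel.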
    • Lei Wang's avatar
    • [Bugfix] Fix atomicadd auto vectorize identify var error (#883) · 340bfc50
      Yuqi Dong authored
      * update
      
      * update
      
      * update
      
      * update
  3. 12 Oct, 2025 3 commits
  4. 11 Oct, 2025 7 commits
  5. 10 Oct, 2025 6 commits
    • [Bugfix] Fix dummy kernel compilation (#962) · 7913fb1d
      Chaofan Lin authored
      
      
      * [Bugfix] Fix visit EvaluateNode in BufferGemmCollector
      
      * address comment
      
      * lint
      
      * fix
      
      * Add TileLang SplitHostDevice pass and tighten issue 830 test names
      
      * lint fix
      
      * enhance for kernel value unpacking.
      
      ---------
      Co-authored-by: LeiWang1999 <leiwang1999@outlook.com>
    • Xiaoyu Zhang's avatar
      6031416f
    • [CI] add `pre-commit` integration (#955) · 8fe35402
      Xuehai Pan authored
      
      
      * chore: misc cleanup
      
      * feat: add pre-commit config
      
      * chore: update lint dependencies
      
      * style: fix lint issues
      
      * feat: add pre-commit hooks
      
      * fix: fix typos
      
      * chore: update .gitattributes
      
      * [Lint]: [pre-commit.ci] auto fixes [...]
      
      * docs: update CONTRIBUTING.md
      
      * chore: update default venv name
      
      * chore: revert and exclude CUDA files
      
      ---------
      Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    • [Bugfix] Do not force inline let stmt (#947) · f8ae600c
      Lei Wang authored
      * remove debug print
      
      * Remove inline let expressions from the LowerAndLegalize function in phase.py
      
      * add test
      
      * Update sparse MLA examples to support SKV adjustment and correctness checks
      
      - Changed SKV parameter from 32768 to 8192 in sparse MLA backward and forward tests.
      - Added check_correctness parameter to test functions for validation of outputs.
      - Updated test cases to reflect new SKV values and correctness checks.
      
      * reduce test shape
      
      * Update documentation structure and refactor main function parameters in example_fusedmoe_tilelang.py
      
      - Added a new section for compiler internals in the documentation.
      - Refactored the main function in example_fusedmoe_tilelang.py to accept parameters for hidden dimensions, expert configurations, and batch/sequence sizes, improving flexibility and readability.
      
      * Update buffer access checks in merge_shared_memory_allocations.cc
      
      - Changed the condition for buffer access from less than (<) to less than or equal to (<=) to allow access at the same scope level.
      - Adjusted the logic for determining the access level when touching buffers to ensure correct handling of scope levels.
      
      * lint fix
      
      * Support pipeline with LetStmt
      
      * lint fix
      
      * Fix LowerTileOp let handling to avoid LetInline dependency

      - inline let-bound BufferLoad nodes via resolver helpers and structured return
      - remap layouts/buffers using original data vars and only rewrite when needed
      - update pipeline planner to understand let-bound address_of buffers
      - document the new inline behaviour in docs/let_inline_fix.md
      
      * fix for wgmma pipeline with let binding
      
      * lint fix
      
      * test fix
      
      * reduce smem usage.
      
      * let binding enhancement
      
      * fix for dpgm
      
      * fix simplify
      
      * lint fix
      
      * use tilelang.Simplify instead of tir.Simplify
      
      * Add TL_FORCE_LET_INLINE pass config and gate eager LetInline usage

      - register the new config in builtin headers/registration
      - add helper to pipeline enabling LetInline based on pass context
      - document LetStmt inlining controls and usage
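The last step of the commit above gates eager LetInline usage behind a pass config. The real helper and registration code are not shown in the log, so the following is only a sketch of the gating idea, using plain Python callables as stand-ins for compiler passes. The key string `tl.force_let_inline` and every function name here are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Assumed key string: the log names the option TL_FORCE_LET_INLINE but does
# not show the string it is registered under.
FORCE_LET_INLINE_KEY = "tl.force_let_inline"

# A "pass" here is a stand-in that takes and returns a textual module.
Pass = Callable[[str], str]


def let_inline(mod: str) -> str:
    """Stand-in for an eager LetInline pass."""
    return mod + " -> LetInline"


def simplify(mod: str) -> str:
    """Stand-in for a pass that always runs."""
    return mod + " -> Simplify"


def should_force_let_inline(config: Dict[str, object]) -> bool:
    """Return True only when the config explicitly enables eager inlining."""
    return bool(config.get(FORCE_LET_INLINE_KEY, False))


def build_pipeline(config: Dict[str, object]) -> List[Pass]:
    """Include LetInline only when the config flag is set."""
    passes: List[Pass] = []
    if should_force_let_inline(config):
        passes.append(let_inline)
    passes.append(simplify)
    return passes


def run(mod: str, config: Dict[str, object]) -> str:
    """Apply the configured passes to the module in order."""
    for p in build_pipeline(config):
        mod = p(mod)
    return mod


print(run("mod", {}))                            # mod -> Simplify
print(run("mod", {FORCE_LET_INLINE_KEY: True}))  # mod -> LetInline -> Simplify
```

Defaulting the flag to off matches the commit title: let statements are no longer inlined unless the caller asks for it.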
    • [Example] Add support for `bfloat16` and user-defined `sm_scale` in attention sink examples (#924) · 7cd0da99
      Tong WU authored
      
      
      * revert split+sum template for MHA backward
      
      * lint
      
      * Update example_mha_bwd.py
      
      * Update example_mha_bwd_wgmma_pipelined.py
      
      * Refactor attention sink examples to support bf16 and user-defined softmax scale
      
      * fix typos
      
      * Adding compile flags for fast math optimizations and enabling BF16 support in both GQA and MHA backward implementations.
      
      * Update backward configuration for GQA and MHA examples to align with flash attention
      
      * Refactor GQA backward implementation to improve atomic add performance
      
      * Allow for slightly larger numerical error for bf16
      
      * upd readme to show bf16 benchmark results
      
      * lint
      
      * fix ci and lint
      
      * fix comments and lint
      
      * refactor atomic add
      
      ---------
      Co-authored-by: Lei Wang <34334180+LeiWang1999@users.noreply.github.com>
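The commit above adds a user-defined `sm_scale` and allows a slightly larger numerical error for bf16. The log shows neither the default scale nor the tolerances used, so the sketch below is an assumption throughout: it falls back to the conventional attention scaling of 1/sqrt(head_dim) when no scale is given, and the tolerance values and helper names are hypothetical.

```python
import math
from typing import Optional

# Hypothetical tolerances: bf16 keeps fewer mantissa bits than fp16, so its
# reference comparison is given a looser bound.
TOLERANCES = {"float16": 1e-2, "bfloat16": 2e-2}


def resolve_sm_scale(head_dim: int, sm_scale: Optional[float] = None) -> float:
    """Use the caller's softmax scale, else the conventional 1/sqrt(head_dim)."""
    if sm_scale is not None:
        return sm_scale
    return 1.0 / math.sqrt(head_dim)


def tolerance_for(dtype: str) -> float:
    """Look up the comparison tolerance for a dtype name."""
    return TOLERANCES[dtype]


print(resolve_sm_scale(64))       # 0.125
print(resolve_sm_scale(64, 0.5))  # 0.5
```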
    • [Docs] add CODE_OF_CONDUCT.md (#965) · 8f07b9b0
      Xuehai Pan authored
      
      
      * [Docs] add CODE_OF_CONDUCT.md
      
      * Update CODE_OF_CONDUCT.md
      
      ---------
      Co-authored-by: Lei Wang <34334180+LeiWang1999@users.noreply.github.com>
  6. 09 Oct, 2025 10 commits
  7. 07 Oct, 2025 3 commits
  8. 06 Oct, 2025 3 commits