"vscode:/vscode.git/clone" did not exist on "9ba8b4801a7f811334c4f1c24dd1ebc721f9a4a9"
  1. 01 Dec, 2025 1 commit
    • botbw's avatar
      [Language] support `T.gemm_sp_v2` on sm80 and sm89 (#1056) · 283a9a00
      botbw authored
      * [misc] add a cpp side wrapper for gemm_sp_py
      
      * [misc] typing
      
      * [IR] bind GemmSPWarpPolicy
      
      * [chore] add wrapper code
      
      * [IR] fix GemmSPWarpPolicy
      
      * [codegen] apply ptxas instructions
      
      * [intrinsic] add typical (unused) mma layout
      
      * [template] add uint16 debug func
      
      * [intrinsic] add b matrix layout
      
      * [gemm_sp] enable fp16/bf16 on sm8x
      
      * [layout] refactor fp16/bf16 layout
      
      * [gemm_sp] enable int8
      
      * [chore] update test case dtype
      
      * [gemm_sp] enable fp32
      
      * [layout] refactor layouts
      
      * [intrinsic] enable ldmatrix for mat A
      
      * [layout] enable ldsm for matrix b
      
      * [layout] add ldmatrix for fp32 and fp8
      
      * [chore] refine
      
      * [chore] refactor
      
      * [chore] add fp8 efactor
      
      * [chore] refactor
      
      * [chore] add remove negative zero util
      
      * [example] add a custom compress kernel
      
      * [chore] minor update
      
      * [test] refactor gemm_sp test
      
      * [refactor] make metadata layout func
      
      * [example] add option for using cutlass layout
      
      * [doc] add a gemm_sp doc
      
      * [doc] minor polish
      
      * [chore] remove unused
      
      * [bugfix] fix non replicate b case
      
      * [test] refactor
      
      * [chore] add a check
      
      * [bugfix] fix util bug
      
      * [wip] init a new test case for v2
      
      * [chore] minor refactor
      
      * [chore] minor update
      
      * [bugfix] enable 16bit rs
      
      * [language] enable rs
      
      * [language] enable gemm_sp_sr
      
      * [language] enable gemm_sp_rr
      
      * [test] enable more tests
      
      * [tvm] update ffi binding
      
      * [chore] remove print
      
      * [chore] fix benchmark script
      
      * [lint] precommit lint
      
      * [chore] apply feedback
      
      * [test] use arch 8.0
      
      * [chore] rollback ::ordered_metadata for backward compatibility
      
      * [bugfix] fix captialized
      
      * [example] keep gemm_sp on hopper
      
      * [test] fix no fp8 normal kernel
      
      * [test] reduce matmul size to satisfy accum error
      
      * [test] use cal_diff for assertion
      
      * [bugfix] expand float8 type
      
      * [lib] add make_int4 for short type
      
      * [language] add transpose E
      
      * [bugfix] fix wrong var
      
      * [format] format
      
      * [chore] refactor binding
      
      * [chore] fix wrong passing var
      283a9a00
  2. 27 Nov, 2025 1 commit
    • Lei Wang's avatar
      [Refactor] Improve assertion handling in CodeGenCHost and ArgBinder (#1352) · 1e92d11c
      Lei Wang authored
      * [Refactor] Improve assertion handling in CodeGenCHost and ArgBinder
      
      This commit refines the assertion message generation in CodeGenCHost by optimizing the handling of equality checks and reducing buffer size for error messages. Additionally, it enhances the ArgBinder by introducing a nullable guard mechanism for assertions, allowing for more precise error handling when binding arguments. The changes improve the clarity and efficiency of assertion handling across the codebase.
      
      * [Enhancement] Update matmul kernel and optimize argument binding
      
      This commit enhances the matmul kernel by introducing additional tensor parameters and refining the pipeline stages for improved performance. It also updates the argument binding mechanism to include a flag indicating whether buffers are used, enhancing the efficiency of buffer management. Furthermore, the optimization phase in the engine is improved by adding a simplification step, ensuring better performance and clarity in the generated code.
      
      * lint fix
      
      * [Enhancement] Add tensor checks documentation and improve argument binding assertions
      
      This commit introduces a new documentation page for host-side tensor checks, detailing the automatic validations performed by TileLang on kernel arguments. It enhances the ArgBinder by adding assertions for non-null pointers when arguments are used, improving error handling. Additionally, the optimization phase in the engine is updated to include a simplification step, ensuring better performance and clarity in the generated code.
      
      * [Enhancement] Update .gitignore and refine matmul kernel for improved performance
      
      This commit adds host checks logs to the .gitignore file to prevent unnecessary log files from being tracked. Additionally, it refines the matmul kernel by adjusting pipeline stages, updating tensor parameters, and enhancing argument handling for better performance. The changes also include improved error messages in the argument binding process, ensuring clearer diagnostics for users.
      
      * lint fix
      
      * lint fix
      
      * [Refactor] Simplify tensor_null_test function and remove ptr_null_test
      
      This commit refactors the tensor_null_test function by adding a with_bias parameter and removing the ptr_null_test function, which was previously unused. The run_test function is updated to reflect these changes, streamlining the testing process for tensor operations.
      
      * lint fix
      
      * fix
      1e92d11c
  3. 21 Oct, 2025 1 commit
    • Lei Wang's avatar
      [Target] Enhance target selection helpers and documentation (#1085) · 42c267e8
      Lei Wang authored
      * Improve target docs and helper messaging
      
        Commit Message:
      
        - add SUPPORTED_TARGETS metadata and expose describe_supported_targets()
        - relax target validation to accept option suffixes and upgrade error messages
        - document target usage and compute capability mapping in docs/get_started/targets.md
        - note preference for string targets when caching and link the new guide in docs/index.md
      
      * remove american english spelling
      42c267e8
  4. 11 Oct, 2025 1 commit
    • Lei Wang's avatar
      [Refactor] Refactor Pass `InjectFenceProxy` and expose some warp group... · ddfaac36
      Lei Wang authored
      [Refactor] Refactor Pass `InjectFenceProxy` and expose some warp group primitives in frontend (#977)
      
      * • InjectFenceProxy docs and tests
      
        - annotate proxy fence injector with context comments for async/generic detection
        - add compiler internals doc covering the pass mechanics and link it in docs index
        - repair fence proxy test by fixing descriptor init usage and fence counter logic
      
      * do not consider call_extern as async.
      
      * doc update.
      
      * reduce test size for sparse mla
      ddfaac36
  5. 10 Oct, 2025 1 commit
    • Lei Wang's avatar
      [Bugfix] Do not force inline let stmt (#947) · f8ae600c
      Lei Wang authored
      * remove debug print
      
      * Remove inline let expressions from the LowerAndLegalize function in phase.py
      
      * add test
      
      * Update sparse MLA examples to support SKV adjustment and correctness checks
      
      - Changed SKV parameter from 32768 to 8192 in sparse MLA backward and forward tests.
      - Added check_correctness parameter to test functions for validation of outputs.
      - Updated test cases to reflect new SKV values and correctness checks.
      
      * reduce test shape
      
      * Update documentation structure and refactor main function parameters in example_fusedmoe_tilelang.py
      
      - Added a new section for compiler internals in the documentation.
      - Refactored the main function in example_fusedmoe_tilelang.py to accept parameters for hidden dimensions, expert configurations, and batch/sequence sizes, improving flexibility and readability.
      
      * Update buffer access checks in merge_shared_memory_allocations.cc
      
      - Changed the condition for buffer access from less than (<) to less than or equal to (<=) to allow access at the same scope level.
      - Adjusted the logic for determining the access level when touching buffers to ensure correct handling of scope levels.
      
      * lint fix
      
      * Support pipeline with LetStmt
      
      * lint fix
      
      * • Fix LowerTileOp let handling to avoid LetInline dependency
      
        - inline let-bound BufferLoad nodes via resolver helpers and structured return
        - remap layouts/buffers using original data vars and only rewrite when needed
        - update pipeline planner to understand let-bound address_of buffers
        - document the new inline behaviour in docs/let_inline_fix.md
      
      * fix for wgmma pipeline with let binding
      
      * lint fix
      
      * test fix
      
      * reduce smem usage.
      
      * let binding enhancement
      
      * fix for dpgm
      
      * fix simplify
      
      * lint fix
      
      * use tilelang.Simplify instead of tir.Simplify
      
      * • Add TL_FORCE_LET_INLINE pass config and gate eager LetInline usage
      
        - register the new config in builtin headers/registration
        - add helper to pipeline enabling LetInline based on pass context
        - document LetStmt inlining controls and usage
      f8ae600c
  6. 24 Jul, 2025 1 commit
  7. 04 Jul, 2025 1 commit
    • Lei Wang's avatar
      [Doc] Phaseout Legacy documentations (#610) · d9ae74c6
      Lei Wang authored
      - Added a new entry in the README for the introduction of `T.gemm_sp` supporting 2:4 sparse tensor core.
      - Removed several outdated documentation files related to convolution, flash attention, and other tutorials to streamline the documentation structure.
      d9ae74c6
  8. 12 Apr, 2025 1 commit
  9. 27 Mar, 2025 1 commit
    • Wenhao Xie's avatar
      [Doc] Python API docs generation (#278) · 5501b31c
      Wenhao Xie authored
      * fix bug
      
      * update performance.py
      
      * update python api docs
      
      * test workflow
      
      * fix dependency
      
      * fix bug
      
      * fix
      
      * update correct git config
      
      * test workflow
      
      * clear cache
      
      * lint fix
      
      * fix exclude path
      5501b31c
  10. 13 Feb, 2025 1 commit