  1. 03 Dec, 2025 1 commit
    • [Refactor] Generalize fp8 process (#1372) · 92121fc6
      Lei Wang authored
      * [Refactor] Update condition for benchmarking in example_gemv.py and simplify cached library path handling in sparse.py
      
      * [Enhancement] Extend support for float8 data types in GEMM operations
      
      - Updated GEMM operations to recognize additional float8 data types: `float8_e4m3fn` and `float8_e5m2fnuz`.
      - Refactored condition checks in `checkWgmma` methods to simplify float8 type handling.
      - Adjusted test cases to ensure compatibility with the new float8 types in tile language examples.
      
      * lint fix
  2. 26 Nov, 2025 1 commit
  3. 24 Nov, 2025 1 commit
  4. 21 Nov, 2025 1 commit
  5. 02 Nov, 2025 1 commit
    • [Language] Expose `T.warpgroup_fence_operand` for nvcc code motion (#986) · aef0a6bb
      Lei Wang authored
      
      
      * remove debug print
      
      * pipeline fix
      
      * use the correct buffer access scope
      
      * rs support
      
      * wrap warpgroup_fence_operand
      
      * fix
      
      * fp8 dtype ptx enhance
      
      * mma fix
      
      * TCGEN05 Interface
      
      * tcgen05 support
      
      * rebase
      
      * update
      
      * Enhance TCGEN05 support by adding new intrinsic operations and descriptors. Introduced `ptx_tcgen05_mma_ts` for tensor-memory to shared-memory instructions and `tcgen05_mma_arrive` for signaling barrier completion. Updated existing descriptors and code generation logic to accommodate these changes, ensuring compatibility with new instruction sets. Refactored related allocation functions and improved handling of shared memory descriptors.
      
      * lint fix
      
      * Refactor buffer reference handling in CUDA code generation and update test execution in tilelang. Ensure default annotations for unrolling are set correctly in TIR IR module.
      
      * wgmma fix
      
      ---------
      Co-authored-by: Zhiwen Mo <zm125@ic.ac.uk>