"git@developer.sourcefind.cn:gaoqiong/migraphx.git" did not exist on "8addb9d5529ebe6dac9136c8cac780150f8cedd2"
    Yu Cheng authored
    [Dev][Bugfix] Fix bug in ThreadTagChecker; Add WgmmaSync rewriter and add MHA WGMMA pipelined example (#128)
    
    * [Dev] Add RetNet Linear Attention example
    
    * [Dev] Add WgmmaSync rewriter for pipelined WGMMA operations and add MHA WGMMA pipelined example (FA3-like scheduling)
    
    This commit introduces a new transformation pass, `RewriteWgmmaSync`, to optimize warpgroup matrix multiply-accumulate (WGMMA) operations in the TileLang compiler:
    
    - Implemented `WgmmaSyncRewriter` in `src/transform/wgmma_sync_rewriter.cc`
    - Added pass registration for `RewriteWgmmaSync`
    - Updated `tilelang/engine/phase.py` to include the new transformation pass
    - Updated `tilelang/transform/__init__.py` to expose the new pass
    
    The rewriter manages synchronization and dependencies between WGMMA operations, which improves pipeline efficiency for complex matrix multiplication kernels.
    
    * [Bugfix] Fix bug in ThreadTagChecker for warp specialization
    
    Improve thread tag validation in the warp-specialized rewriter to prevent unintended transformations:
    - Add more precise checks for threadIdx.y and threadIdx.z
    - Validate thread extent to ensure only single-extent thread bindings are allowed
    - Prevent warp specialization for multi-extent thread bindings in y and z dimensions
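
The extent rule described in the list above can be illustrated with a small standalone sketch. This is not the C++ `ThreadTagChecker` from the commit: the `ThreadBinding` class and the `is_warp_specialization_allowed` function are hypothetical names introduced only for illustration, and the sketch assumes the check reduces to "every `threadIdx.y` / `threadIdx.z` binding must have extent 1".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ThreadBinding:
    """Hypothetical stand-in for a thread binding in the IR."""
    tag: str     # e.g. "threadIdx.x", "threadIdx.y", "blockIdx.x"
    extent: int  # number of threads (or blocks) bound to this tag


def is_warp_specialization_allowed(bindings: Iterable[ThreadBinding]) -> bool:
    """Return True only if no threadIdx.y / threadIdx.z binding has extent > 1.

    Assumption: bindings on other tags (threadIdx.x, blockIdx.*) do not
    affect the decision in this simplified model.
    """
    for binding in bindings:
        if binding.tag in ("threadIdx.y", "threadIdx.z") and binding.extent != 1:
            return False
    return True


# A kernel that only spreads threads along x passes the check.
single_axis = [ThreadBinding("threadIdx.x", 128), ThreadBinding("threadIdx.y", 1)]
# A kernel with a multi-extent y binding is rejected.
multi_axis = [ThreadBinding("threadIdx.x", 128), ThreadBinding("threadIdx.y", 2)]

print(is_warp_specialization_allowed(single_axis))
print(is_warp_specialization_allowed(multi_axis))
```

In this simplified model, rejecting the kernel means leaving it untransformed, which matches the stated goal of preventing unintended transformations.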
    
    * lint
    
    * [CI] Add TMA descriptor attribute to transformed module in test case
    0d873fcf
phase.py 3.41 KB