    [Enhancement] Remove DeReplicate during parallel loop layout inference (#430) · bb1a5fd8
    Lei Wang authored
    * [Refactor] Adjust layout inference calculations in Gemm and ParallelOp
    
    * Updated block size calculation in Gemm to account for the range of thread bounds, improving accuracy in layout inference.
    * Simplified layout conflict error messages in ParallelOp for better clarity, enhancing debugging experience.
    * Removed redundant buffer checks in ParallelOp layout inference logic, streamlining the code.
    
    * [Refactor] Clean up layout inference logic in Gemm and ParallelOp
    
    * Removed unnecessary warning log in Gemm related to WGMMA conditions, streamlining the layout inference process.
    * Commented out redundant checks in ParallelOp's layout inference, improving code clarity while maintaining functionality.
    * Enhanced error messages in ParallelOp to provide clearer context for layout conflicts, aiding in debugging efforts.
    
    * lint fix
parallel.cc 12.1 KB