    [Refactor] Refactor for Better Layout Conflict Handling (#240) · 2a286ae6
    Lei Wang authored
    * [Feature] Add reduce_max functionality and corresponding tests
    
    * Introduced a new test file for the reduce_max operation in the tilelang language module.
    * Implemented the reduce_max functionality using T.prim_func, including local memory allocation and result copying.
    * Added tests for various input sizes and data types to ensure correctness of the reduce_max implementation.
    * Enhanced profiling assertions to validate the output against reference implementations.
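For illustration only, the kind of reference result a reduce_max test compares against can be sketched as a plain host-side row-wise maximum. `ReduceMaxRef` and its row-major layout are assumptions made for this sketch; it is not the tilelang kernel or the reference implementation the tests actually use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference: row-wise maximum of a row-major m x n matrix.
// Returns a vector of m values, one maximum per row.
std::vector<float> ReduceMaxRef(const std::vector<float>& a, std::size_t m,
                                std::size_t n) {
  // A maximum over an empty row is undefined, so require n > 0.
  assert(n > 0 && a.size() == m * n);
  std::vector<float> out(m);
  for (std::size_t i = 0; i < m; ++i) {
    auto row_begin = a.begin() + static_cast<std::ptrdiff_t>(i * n);
    auto row_end = row_begin + static_cast<std::ptrdiff_t>(n);
    out[i] = *std::max_element(row_begin, row_end);
  }
  return out;
}
```

A kernel output would then be checked element by element against this vector, with a tolerance suited to the data type under test.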
    
    * Fix whitespace issues in reduce_max test file for improved readability
    
    * [Refactor] Update DebugOutput methods to return strings instead of void
    
    * Modified DebugOutput methods in LayoutNode, FragmentNode, and SwizzledLayoutNode to return std::string instead of void, enhancing usability for logging and debugging.
    * Updated corresponding header files to reflect the new return types.
    * Improved layout inference error messages by incorporating DebugOutput for better clarity in layout conflicts.
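A minimal sketch of why a string-returning DebugOutput is convenient: the result can be spliced directly into an error message. `ToyLayout` and `ConflictMessage` are hypothetical stand-ins invented for this example, not the real LayoutNode or the actual message format.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a layout node; the fields are illustrative only.
struct ToyLayout {
  std::string name;
  int rows;
  int cols;

  // Returns the description instead of printing it, so callers decide
  // where the text goes (log, exception, assertion message).
  std::string DebugOutput() const {
    std::ostringstream os;
    os << "Layout(" << name << ", shape=[" << rows << ", " << cols << "])";
    return os.str();
  }
};

// Builds a conflict message that embeds both layout descriptions.
std::string ConflictMessage(const std::string& buffer, const ToyLayout& a,
                            const ToyLayout& b) {
  std::ostringstream os;
  os << "Layout conflict for buffer " << buffer << ": " << a.DebugOutput()
     << " vs " << b.DebugOutput();
  return os.str();
}
```

With a void DebugOutput that writes to a stream, the same message would need the caller to capture or duplicate the formatting logic.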
    
    * lint fix
    
    * Fix typo in matmul function: changed loop from T.Parallel to T.grid for correct parallel execution in webgpu code generation tests.
    
    * [Enhancement] Improve layout inference conflict handling in ParallelOp
    
    * Updated the layout inference logic in ParallelOp to better handle conflicts for local.fragment buffers.
    * Added checks to ensure that layout conflicts are reported only when both source and destination buffers are defined, improving clarity in error messages.
    * Enhanced the overall robustness of the layout inference process by addressing specific cases where conflicts may arise.
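The "report only when both sides are defined" rule can be sketched roughly as follows. Layouts are reduced to plain strings and `CheckConflict` is a hypothetical helper; the real logic in ParallelOp works on buffer and layout objects and may differ in detail.

```cpp
#include <optional>
#include <string>

// Hypothetical check: returns an error message only when both layouts are
// known and disagree. An undefined side means inference has not decided
// yet, so nothing is reported.
std::optional<std::string> CheckConflict(
    const std::optional<std::string>& src_layout,
    const std::optional<std::string>& dst_layout) {
  if (!src_layout.has_value() || !dst_layout.has_value()) {
    return std::nullopt;  // At least one side is undefined: no conflict.
  }
  if (*src_layout == *dst_layout) {
    return std::nullopt;  // Both defined and identical: no conflict.
  }
  return "Layout conflict: source " + *src_layout + " vs destination " +
         *dst_layout;
}
```

Skipping the undefined case avoids error messages that would otherwise mention a buffer with no layout to show.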
    
    * [Feature] Add IsEqual methods for layout comparison
    
    * Introduced IsEqual methods in LayoutNode, FragmentNode, and SwizzledLayoutNode to facilitate structural equality checks, allowing for optional index comparison.
    * Enhanced layout inference logic in Copy and ParallelOp to utilize the new IsEqual methods for better conflict detection in local.fragment layouts.
    * Improved error messages for layout conflicts to provide clearer guidance on potential issues.
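As a rough sketch of a structural equality check with optional index comparison, the toy type below compares shape-like fields always and the index mapping only on request. `ToyFragment`, its fields, and the `skip_index` parameter name are assumptions for illustration, not the actual FragmentNode interface.

```cpp
#include <vector>

// Hypothetical stand-in for a fragment layout.
struct ToyFragment {
  std::vector<int> input_shape;    // extents of the logical buffer
  std::vector<int> forward_index;  // simplified index mapping
  int replicate_size;              // replication factor across threads

  // Structural equality. When skip_index is true, the index mapping is
  // ignored and only the shape and replication factor are compared.
  bool IsEqual(const ToyFragment& other, bool skip_index = false) const {
    bool same = input_shape == other.input_shape &&
                replicate_size == other.replicate_size;
    if (!skip_index) {
      same = same && forward_index == other.forward_index;
    }
    return same;
  }
};
```

A conflict check could then call `IsEqual` on two inferred layouts for the same local.fragment buffer and raise an error only when it returns false.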
    
    * [Refactor] Update profiler usage in benchmark_nsa_fwd.py and improve layout inference in elem.cc and parallel.cc
    
    * Modified the profiler call in benchmark_nsa_fwd.py to streamline latency measurement.
    * Updated layout inference logic in elem.cc and parallel.cc to use const pointers for FragmentNode, enhancing type safety and clarity.
    * Improved error messages in layout conflict checks to provide better guidance on potential issues.
    
    * [Refactor] Clean up pointer formatting in layout inference files
    
    * Standardized pointer formatting for FragmentNode in elem.cc and parallel.cc to improve code readability.
    * Minor adjustments to error message formatting in layout conflict checks for better clarity.
parallel.cc 10.8 KB