"magic_pdf/git@developer.sourcefind.cn:wangsen/mineru.git" did not exist on "5b9fa871bdefe6f3ef32c9e977a023f0d565bb7e"
-
Lei Wang authored
* [Feature] Add reduce_max functionality and corresponding tests * Introduced a new test file for the reduce_max operation in the tilelang language module. * Implemented the reduce_max functionality using T.prim_func, including local memory allocation and result copying. * Added tests for various input sizes and data types to ensure correctness of the reduce_max implementation. * Enhanced profiling assertions to validate the output against reference implementations. * Fix whitespace issues in reduce_max test file for improved readability * [Refactor] Update DebugOutput methods to return strings instead of void * Modified DebugOutput methods in LayoutNode, FragmentNode, and SwizzledLayoutNode to return std::string instead of void, enhancing usability for logging and debugging. * Updated corresponding header files to reflect the new return types. * Improved layout inference error messages by incorporating DebugOutput for better clarity in layout conflicts. * lint fix * Fix typo in matmul function: changed loop from T.Parallel to T.grid for correct parallel execution in webgpu code generation tests. * [Enhancement] Improve layout inference conflict handling in ParallelOp * Updated the layout inference logic in ParallelOp to better handle conflicts for local.fragment buffers. * Added checks to ensure that layout conflicts are reported only when both source and destination buffers are defined, improving clarity in error messages. * Enhanced the overall robustness of the layout inference process by addressing specific cases where conflicts may arise. * [Feature] Add IsEqual methods for layout comparison * Introduced IsEqual methods in LayoutNode, FragmentNode, and SwizzledLayoutNode to facilitate structural equality checks, allowing for optional index comparison. * Enhanced layout inference logic in Copy and ParallelOp to utilize the new IsEqual methods for better conflict detection in local.fragment layouts. * Improved error messages for layout conflicts to provide clearer guidance on potential issues.houm * [Refactor] Update profiler usage in benchmark_nsa_fwd.py and improve layout inference in elem.cc and parallel.cc * Modified the profiler call in benchmark_nsa_fwd.py to streamline latency measurement. * Updated layout inference logic in elem.cc and parallel.cc to use const pointers for FragmentNode, enhancing type safety and clarity. * Improved error messages in layout conflict checks to provide better guidance on potential issues. * [Refactor] Clean up pointer formatting in layout inference files * Standardized pointer formatting for FragmentNode in elem.cc and parallel.cc to improve code readability. * Minor adjustments to error message formatting in layout conflict checks for better clarity.
2a286ae6