[Bugfix] Fix layout inference for free fragment buffer (#443)
* [Enhancement] Improve layout inference accuracy in ParallelOp (#441)
  * Added logic to use non-replicated buffers as source buffers for more accurate layout inference.
  * Enhanced comments to clarify the rationale behind buffer selection in the layout inference process.
* [Enhancement] Add error handling macros and refactor loop partitioning logic
  * Introduced TILELANG_CHECK macro for improved error handling in CUDA and HIP code, providing detailed error messages for kernel launches.
  * Enhanced loop partitioning logic to handle fragment buffers more effectively, ensuring correct replication based on thread extent.
  * Added logging for thread range in PlanLoopPartition to aid in debugging and performance analysis.
  * Updated pass configuration management to streamline vectorization control in the optimization process.
* lint fix
* remove debug print