[Refactor] Reorganize Thread Synchronization Steps to make sure global synchronization can be correctly lowered (#521)

* [Refactor] Reorganize Thread Synchronization Steps in OptimizeForTarget Function
  * Removed redundant thread synchronization steps for "global" and "shared" memory, streamlining the optimization process.
  * Reintroduced necessary synchronization for "shared" and "shared.dyn" after the injection of PTX async copy, ensuring correct memory access patterns.
  * Enhanced overall clarity and maintainability of the OptimizeForTarget function by restructuring the order of operations.

* [Refactor] Reorder Thread Synchronization and PTX Async Copy in OptimizeForTarget Function
  * Removed redundant global thread synchronization step and adjusted the order of operations for shared memory synchronization.
  * Ensured that the PTX async copy injection occurs after the global thread sync, improving memory access validity.
  * Enhanced clarity and maintainability of the OptimizeForTarget function by restructuring synchronization steps.
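The ordering described above can be sketched as a small self-contained model. This is only an illustration, not the actual OptimizeForTarget code: the pass names are plain strings standing in for the real passes, and `PIPELINE`, `runs_before`, and `check_order` are hypothetical helpers introduced here. The assumed order (global sync, then PTX async copy injection, then "shared" and "shared.dyn" sync) is one reading of the commit message.

```python
# Illustrative model only: entries are labels, not real compiler passes.
# The order below is an assumption inferred from the commit message.
PIPELINE = [
    'ThreadSync("global")',
    "InjectPTXAsyncCopy",
    'ThreadSync("shared")',
    'ThreadSync("shared.dyn")',
]


def runs_before(pipeline, first, second):
    """Return True if `first` appears earlier than `second` in the pipeline."""
    return pipeline.index(first) < pipeline.index(second)


def check_order(pipeline):
    """Return the (earlier, later) constraints that the pipeline violates."""
    constraints = [
        # PTX async copy injection occurs after the global thread sync.
        ('ThreadSync("global")', "InjectPTXAsyncCopy"),
        # Shared-memory syncs are reintroduced after PTX async copy injection.
        ("InjectPTXAsyncCopy", 'ThreadSync("shared")'),
        ("InjectPTXAsyncCopy", 'ThreadSync("shared.dyn")'),
    ]
    return [(a, b) for a, b in constraints if not runs_before(pipeline, a, b)]


if __name__ == "__main__":
    violations = check_order(PIPELINE)
    print("violations:", violations)
```

A check like this could guard against accidentally reordering the steps in a later refactor, assuming the constraints above match the intended lowering order.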