[Bugfix] Put thread_extent into reduce (#640)
* [Enhancement] Update AllReduce operation to include thread offset in kernel generation
  - Modified the `ReduceOp::Lower` method to incorporate the thread offset in the AllReduce kernel generation for the sm_90 architecture.
  - This change improves the accuracy of thread management during reduction operations, enhancing performance on specific GPU architectures.
* [Enhancement] Refactor thread offset handling in AllReduce kernel generation
  - Updated the `ReduceOp::Lower` method to streamline the handling of thread offset for AllReduce operations, ensuring consistent usage across different architectures.
  - This change enhances code clarity and maintains performance improvements for the sm_90 architecture by reducing redundancy in thread offset calculations.
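
To illustrate why a thread offset matters in a group all-reduce, here is a minimal host-side C++ model. It is a sketch only, not the code emitted by `ReduceOp::Lower`: the function name `all_reduce_sum`, its parameters, and the butterfly (XOR) exchange pattern are assumptions chosen for illustration. The point it shows is that when the reducing threads start at a non-zero thread index, each thread's partner must be computed from its index relative to the offset.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: vals[tid] is the value held by thread `tid`.
// Threads in [thread_offset, thread_offset + extent) sum their values with a
// butterfly exchange; every thread in the group ends up with the full sum.
// Threads outside the group are left untouched.
std::vector<int> all_reduce_sum(const std::vector<int>& vals,
                                int thread_offset, int extent) {
    // The XOR butterfly assumes a power-of-two group size.
    assert(extent > 0 && (extent & (extent - 1)) == 0);
    assert(thread_offset >= 0 &&
           thread_offset + extent <= static_cast<int>(vals.size()));

    std::vector<int> cur = vals;
    for (int stride = extent / 2; stride >= 1; stride /= 2) {
        std::vector<int> next = cur;  // all threads exchange "simultaneously"
        for (int tid = thread_offset; tid < thread_offset + extent; ++tid) {
            int rel = tid - thread_offset;                 // index inside the group
            int partner = thread_offset + (rel ^ stride);  // partner stays in the group
            next[tid] = cur[tid] + cur[partner];
        }
        cur = next;
    }
    return cur;
}
```

With values `{100, 100, 100, 100, 1, 2, 3, 4}`, `thread_offset = 4`, and `extent = 4`, threads 4 through 7 each end with the sum of `1, 2, 3, 4`, while threads 0 through 3 keep their original values. Dropping the offset from the partner computation would pair threads with lanes outside the intended group.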