    [Bugfix] Put thread_extent into reduce (#640) · 156ff85e
    Lei Wang authored
    * [Enhancement] Update AllReduce operation to include thread offset in kernel generation
    
    - Modified the `ReduceOp::Lower` method to incorporate the thread offset in the AllReduce kernel generation for the sm_90 architecture.
    - This makes thread handling during reduction operations more accurate and improves performance on the affected GPU architectures.
    
    * [Enhancement] Refactor thread offset handling in AllReduce kernel generation
    
    - Updated the `ReduceOp::Lower` method to streamline the handling of thread offset for AllReduce operations, ensuring consistent usage across different architectures.
    - This removes redundant thread offset calculations, which makes the code clearer while keeping the performance improvements for the sm_90 architecture.
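
To illustrate why a thread offset matters in an all-reduce, here is a minimal host-side sketch in plain C++. It is not the code from `reduce.cc` and does not reproduce the kernel that `ReduceOp::Lower` generates; the function name `all_reduce_sum` and its parameters are hypothetical. It assumes the reducing threads form a contiguous group that starts at `thread_offset` inside a larger block, and that the scratch buffer has only `group_size` slots, so each thread has to index it with `tid - thread_offset`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model of a butterfly (XOR) all-reduce.
// values[t] is the value held by block thread t. Only the threads in
// [thread_offset, thread_offset + group_size) take part; all others are
// left untouched. group_size must be a power of two.
std::vector<int> all_reduce_sum(const std::vector<int>& values,
                                std::size_t group_size,
                                std::size_t thread_offset) {
  assert(group_size > 0 && (group_size & (group_size - 1)) == 0);
  assert(thread_offset + group_size <= values.size());

  std::vector<int> result = values;
  // Scratch space sized for the group only, standing in for shared memory.
  std::vector<int> scratch(group_size);

  for (std::size_t stride = group_size / 2; stride > 0; stride /= 2) {
    // Phase 1: every participating thread publishes its partial sum.
    // Subtracting thread_offset maps the group onto slots [0, group_size).
    for (std::size_t tid = thread_offset; tid < thread_offset + group_size; ++tid) {
      scratch[tid - thread_offset] = result[tid];
    }
    // Phase 2: every thread combines with its XOR partner inside the group.
    // Using (tid ^ stride) without the subtraction would index past the
    // end of scratch whenever thread_offset is non-zero.
    for (std::size_t tid = thread_offset; tid < thread_offset + group_size; ++tid) {
      result[tid] += scratch[(tid - thread_offset) ^ stride];
    }
  }
  return result;
}
```

For example, with eight threads holding `{100, 100, 100, 100, 1, 2, 3, 4}`, calling `all_reduce_sum(values, 4, 4)` leaves threads 0 to 3 unchanged and gives each of threads 4 to 7 the group sum. The two separate loops per step stand in for the barrier a real kernel would need between the write and the read.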
reduce.cc 12.6 KB