    [Bugfix] Support 16bits shfl_sync (#1169) · 54d4bd62
    Lei Wang authored
    * Add type-safe warp shuffle helpers for 16-bit float types in common.h
    
    - Introduced generic passthrough functions for warp shuffle operations: `shfl_xor_sync`, `shfl_down_sync`, `shfl_up_sync`, and `shfl_sync`.
    - Added specializations for `cutlass::half_t` and `cutlass::bfloat16_t` to ensure type safety during shuffle operations.
    - Updated `reduce.h` to utilize the new shuffle functions, enhancing code clarity and maintainability.
    
    * lint fix
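
The helpers described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `common.h`: the `sketch` namespace, the float round trip inside the specializations, and the `warp_sum_kernel` / `run_warp_sum_demo` names are assumptions made for the example. Only `shfl_xor_sync` is shown; `shfl_down_sync`, `shfl_up_sync`, and `shfl_sync` would presumably follow the same pattern.

```cuda
#include <cuda_runtime.h>
#include <cutlass/numeric_types.h>  // cutlass::half_t, cutlass::bfloat16_t

namespace sketch {  // hypothetical namespace, not the one used in common.h

// Generic passthrough: forwards to the CUDA intrinsic for types it accepts
// natively (int, unsigned, float, double, ...).
template <typename T>
__device__ __forceinline__ T shfl_xor_sync(unsigned mask, T val, int lane_mask) {
  return __shfl_xor_sync(mask, val, lane_mask);
}

// Specialization for cutlass::half_t. Assumption: widen to float, shuffle,
// then narrow. Every half value is exactly representable as a float, so the
// round trip does not change the value.
template <>
__device__ __forceinline__ cutlass::half_t shfl_xor_sync<cutlass::half_t>(
    unsigned mask, cutlass::half_t val, int lane_mask) {
  float widened = static_cast<float>(val);
  return cutlass::half_t(__shfl_xor_sync(mask, widened, lane_mask));
}

// Same approach for cutlass::bfloat16_t (also exactly representable in float).
template <>
__device__ __forceinline__ cutlass::bfloat16_t shfl_xor_sync<cutlass::bfloat16_t>(
    unsigned mask, cutlass::bfloat16_t val, int lane_mask) {
  float widened = static_cast<float>(val);
  return cutlass::bfloat16_t(__shfl_xor_sync(mask, widened, lane_mask));
}

}  // namespace sketch

// Butterfly (XOR) all-reduce over one full warp of half_t values.
// Each lane contributes 1.0, so every lane ends up holding the lane count.
__global__ void warp_sum_kernel(float* out) {
  cutlass::half_t v(1.0f);
  for (int offset = 16; offset > 0; offset >>= 1) {
    v = v + sketch::shfl_xor_sync(0xffffffffu, v, offset);
  }
  if (threadIdx.x == 0) {
    *out = static_cast<float>(v);
  }
}

// Host-side driver: launches a single warp and copies the result back.
float run_warp_sum_demo() {
  float* d_out = nullptr;
  float h_out = 0.0f;
  cudaMalloc(&d_out, sizeof(float));
  warp_sum_kernel<<<1, 32>>>(d_out);
  cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  return h_out;
}
```

Routing every shuffle through one templated entry point means a reduction such as the loop in `warp_sum_kernel` can be written once and instantiated for any element type, with the 16-bit handling confined to the specializations.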
common.h 14.7 KB