Commit dbb1235d authored by DevashishLal-CB, committed by GitHub

[Fix] illegal sync based on undefined behaviour (#9620)


Signed-off-by: Devashish Lal <devashish@rivosinc.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
parent ad26f298
@@ -8,7 +8,7 @@
 template <int THREADS_PER_SUBWARP>
 __device__ __forceinline__ float GroupReduceMax(float val, const int tid) {
-  unsigned mask = 0xffff;
+  unsigned mask = threadIdx.x % 32 >= 16 ? 0xffff0000 : 0x0000ffff;
   static_assert(
       (THREADS_PER_SUBWARP & (THREADS_PER_SUBWARP - 1)) == 0 && THREADS_PER_SUBWARP <= 16 && THREADS_PER_SUBWARP >= 1,
......
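
Note on the mask change: the hunk is truncated before `mask` is used, so the following is a sketch under an assumption, namely that `mask` is passed to a warp-level `*_sync` intrinsic such as `__shfl_xor_sync`. Those intrinsics require each calling thread to have its own bit set in the mask it passes; otherwise the result is undefined. The host-side C++ model below (no GPU needed) counts how many of the 32 lanes in a warp would fail that requirement under the old and new expressions. The helper names `old_mask`, `new_mask`, `lane_in_mask`, and `count_uncovered_lanes` are hypothetical and do not appear in the patched file; `lane` stands in for `threadIdx.x % 32`.

```cpp
#include <cstdint>

// Mask before the patch: always the lower 16 lanes, whatever the caller's lane.
static std::uint32_t old_mask(unsigned /*lane*/) { return 0xffffu; }

// Mask after the patch: the half-warp that contains the calling lane.
static std::uint32_t new_mask(unsigned lane) {
  return lane >= 16 ? 0xffff0000u : 0x0000ffffu;
}

// True when the calling lane's own bit is set in the mask it would pass.
static bool lane_in_mask(unsigned lane, std::uint32_t mask) {
  return ((mask >> lane) & 1u) != 0u;
}

// Counts lanes (0..31) whose own bit is missing from the mask they would pass.
static int count_uncovered_lanes(std::uint32_t (*mask_for)(unsigned)) {
  int uncovered = 0;
  for (unsigned lane = 0; lane < 32; ++lane) {
    if (!lane_in_mask(lane, mask_for(lane))) {
      ++uncovered;
    }
  }
  return uncovered;
}
```

Calling `count_uncovered_lanes(old_mask)` reports the 16 upper lanes as uncovered, while `count_uncovered_lanes(new_mask)` reports none, which is consistent with the commit title's description of the old sync as illegal.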