tianlh / LightGBM-DCU / Commits / 06741cca
Commit 06741cca authored Sep 22, 2025 by Jeff Daily
update for ROCm 7 BC-breaking change to warpSize
parent e461e868
Showing 1 changed file with 20 additions and 4 deletions
include/LightGBM/cuda/cuda_rocm_interop.h
 /*!
  * Copyright(C) 2023 Advanced Micro Devices, Inc. All rights reserved.
  */
+#pragma once
 #if defined(USE_CUDA) || defined(USE_ROCM)
-#if defined(__HIP_PLATFORM_AMD__) || defined(__HIP__)
+#if defined(__HIP_PLATFORM_AMD__)
 // ROCm doesn't have __shfl_down_sync, only __shfl_down without mask.
 // Since mask is full 0xffffffff, we can use __shfl_down instead.
 #define __shfl_down_sync(mask, val, offset) __shfl_down(val, offset)
 #define __shfl_up_sync(mask, val, offset) __shfl_up(val, offset)
-// ROCm warpSize is constexpr and is either 32 or 64 depending on gfx arch.
-#define WARPSIZE warpSize
 // ROCm doesn't have atomicAdd_block, but it should be semantically the same as atomicAdd
 #define atomicAdd_block atomicAdd
 // hipify
 #include <hip/hip_runtime.h>
 #define cudaDeviceProp hipDeviceProp_t
...
@@ -41,7 +44,20 @@
 #define cudaStreamDestroy hipStreamDestroy
 #define cudaStream_t hipStream_t
 #define cudaSuccess hipSuccess
-#else // __HIP_PLATFORM_AMD__ || __HIP__
+// warpSize is only allowed for device code.
+// HIP header used to define warpSize as a constexpr that was either 32 or 64
+// depending on the target device, and then always set it to 64 for host code.
+static inline constexpr int WARP_SIZE_INTERNAL() {
+#if defined(__GFX9__)
+  return 64;
+#else // __GFX9__
+  return 32;
+#endif // __GFX9__
+}
+#define WARPSIZE (WARP_SIZE_INTERNAL())
+#else // __HIP_PLATFORM_AMD__
 // CUDA warpSize is not a constexpr, but always 32
 #define WARPSIZE 32
 #endif
 #endif
...
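The change above matters mostly where `WARPSIZE` is used as a compile-time constant. The following is a minimal host-only sketch of that pattern, not code from this commit: it copies the `WARP_SIZE_INTERNAL()` helper from the diff and adds two hypothetical names, `NumWarps` and `WarpScratch`, purely for illustration. It assumes a plain C++ compiler where `__GFX9__` is not defined, so the helper evaluates to 32 here; the header relies on `__GFX9__` being defined when compiling device code for a gfx9 target, which would select 64 instead.

```cpp
// Same shape as the helper in the diff: the preprocessor picks the value,
// and constexpr makes the result usable in constant expressions.
static inline constexpr int WARP_SIZE_INTERNAL() {
#if defined(__GFX9__)
  return 64;
#else  // __GFX9__ (undefined in a plain host build, so this branch is taken)
  return 32;
#endif  // __GFX9__
}
#define WARPSIZE (WARP_SIZE_INTERNAL())

// Hypothetical helper (not in the source): warps needed to cover n threads.
constexpr int NumWarps(int n) { return (n + WARPSIZE - 1) / WARPSIZE; }

// Hypothetical type (not in the source): an array bound must be a constant
// expression, which a runtime-only warpSize could not provide.
struct WarpScratch {
  float slot[WARPSIZE];
};

// Compile-time checks that work because WARPSIZE is a constant expression.
static_assert(WARPSIZE == 32 || WARPSIZE == 64, "unexpected warp size");
static_assert(NumWarps(WARPSIZE + 1) == 2, "ceil division by WARPSIZE");
```

Wrapping the call in a macro keeps existing `WARPSIZE` call sites unchanged while the value still resolves at compile time on both the HIP and CUDA branches of the header.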