Commit 3b56cf85 authored by lijian

fix: defer rocshmem env var reading and align RDMA buffer size



- Update rocshmem submodule to commit with lazy env var reading,
  so ROCSHMEM_HEAP_SIZE set via os.environ can take effect before
  C++ library initialization
- Align num_rdma_bytes to 2GB boundary using bit shifts instead of
  magic numbers
- Update build.sh to checkout the new rocshmem commit

Signed-off-by: lijian <34831075+lijian0711@users.noreply.github.com>
parent 1dd64f3c
......@@ -144,7 +144,7 @@ build_rocshmem()
cd third-party/rocshmem/
git config --global --add safe.directory .
if [ "$BUILD_SHCA" == "ON" ]; then
-git checkout 0a05c14ebd47034ed049afb7b896536fcd91a9a7
+git checkout b118f9ea536d873fef9d411180e44c71900a0a32
fi
if [ ! -d "build" ]; then
mkdir -p build
......
......@@ -122,9 +122,8 @@ class Buffer:
self.gda_num_qps_per_pe = max(int(os.environ.get('ROCSHMEM_GDA_NUM_QPS_PER_PE_DEFAULT_CTX', str(num_qps_per_rank))), num_qps_per_rank * self.group_size)
os.environ["ROCSHMEM_GDA_NUM_QPS_DEFAULT_CTX"] = str(self.gda_num_qps_per_pe)
-if self.num_rdma_bytes > 1073741824:
-multiple = 2147483648
-rocshmem_num_rdma_bytes = ((self.num_rdma_bytes + multiple - 1) // multiple) * multiple
+if self.num_rdma_bytes > (1 << 30):
+rocshmem_num_rdma_bytes = ((self.num_rdma_bytes + (1 << 31) - 1) // (1 << 31)) * (1 << 31)
os.environ["ROCSHMEM_HEAP_SIZE"] = str(rocshmem_num_rdma_bytes)
if self.group_size <= 8:
os.environ["ROCSHMEM_BACKEND"] = "ipc"
......
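
The rounding in the hunk above is a standard "align up" computation: add `alignment - 1`, then floor-divide and multiply back. The sketch below isolates that arithmetic so it can be checked on its own. The names `align_up`, `align_up_mask`, `THRESHOLD`, and `ALIGNMENT` are hypothetical and do not appear in the commit; the mask variant is an equivalent form that only holds when the alignment is a power of two.

```python
THRESHOLD = 1 << 30   # 1073741824 bytes (1 GiB), the comparison value in the diff
ALIGNMENT = 1 << 31   # 2147483648 bytes (2 GiB), the rounding unit in the diff


def align_up(n: int, alignment: int = ALIGNMENT) -> int:
    """Round n up to the next multiple of alignment (same formula as the diff)."""
    return ((n + alignment - 1) // alignment) * alignment


def align_up_mask(n: int, alignment: int = ALIGNMENT) -> int:
    """Equivalent form using a bit mask; valid only for power-of-two alignments."""
    return (n + alignment - 1) & ~(alignment - 1)


# Sample sizes on and around the 2 GiB boundary.
samples = [THRESHOLD + 1, ALIGNMENT, ALIGNMENT + 1]
aligned = [align_up(n) for n in samples]
aligned_mask = [align_up_mask(n) for n in samples]
```

A value already on a 2 GiB boundary is returned unchanged, and anything one byte over moves to the next boundary, so `ALIGNMENT + 1` rounds to `1 << 32`.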
rocshmem @ da2fa573
-Subproject commit bea1a2e7dc6abaa40d4def4800bb1eef52735e2b
+Subproject commit da2fa573cca9c363e5ff7e399a5030d1608657a6
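
The commit message says the new submodule revision reads environment variables lazily, so that `ROCSHMEM_HEAP_SIZE` set through `os.environ` is seen before C++ library initialization. The Python sketch below is only an analogy for that ordering problem and is not rocSHMEM's implementation: `EagerConfig`, `LazyConfig`, and `DEFAULT_HEAP_SIZE` are hypothetical names, and the default value is made up for illustration.

```python
import os

DEFAULT_HEAP_SIZE = 1 << 30  # hypothetical default, not rocSHMEM's actual value


class EagerConfig:
    """Reads the variable at construction, analogous to reading at load time."""

    def __init__(self):
        self.heap_size = int(os.environ.get("ROCSHMEM_HEAP_SIZE", DEFAULT_HEAP_SIZE))


class LazyConfig:
    """Defers the read until first use, analogous to reading at init time."""

    def __init__(self):
        self._heap_size = None

    @property
    def heap_size(self):
        if self._heap_size is None:
            self._heap_size = int(os.environ.get("ROCSHMEM_HEAP_SIZE", DEFAULT_HEAP_SIZE))
        return self._heap_size


# Simulate the ordering: the "library" is loaded first, the caller sets the
# variable afterwards, and only then is the heap size actually consumed.
os.environ.pop("ROCSHMEM_HEAP_SIZE", None)
eager = EagerConfig()
lazy = LazyConfig()
os.environ["ROCSHMEM_HEAP_SIZE"] = str(1 << 31)

eager_seen = eager.heap_size  # still the default: the late assignment was missed
lazy_seen = lazy.heap_size    # picks up the value set after construction
```

With an eager read the assignment made by the caller arrives too late and is silently ignored, which is the failure mode the commit message describes avoiding.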