1. 22 Apr, 2026 1 commit
  2. 21 Apr, 2026 6 commits
    • Bugfix - gpu_stream: remove ROCm build support, require CUDA with NVML (#789) · 3c95714f
      Hongtao Zhang authored
      
      
      Summary
      
      The gpu_stream benchmark has NVIDIA-specific dependencies that prevent
      it from compiling on ROCm 6.3+. This change makes it CUDA-only,
      gracefully skipping the build with a warning on non-NVIDIA
      environments.

      Problem
      
      The gpu_stream benchmark fails to compile on ROCm 6.3+ due to multiple
      NVIDIA-specific dependencies:
      
      1. nvml.h: NVIDIA Management Library header, used for querying actual
      memory clock rates. No HIP equivalent. Referenced in gpu_stream.cu and
      gpu_stream_utils.hpp.
      2. cuda.h in headers: Three .hpp files (gpu_stream.hpp,
      gpu_stream_kernels.hpp, gpu_stream_utils.hpp) directly include <cuda.h>
      and <cuda_runtime.h>. These headers are not processed by hipify-perl
      (only .cu source files are), so they fail to resolve on ROCm.
      3. Deprecated hipDeviceProp_t struct fields: The code accesses
      memoryBusWidth, memoryClockRate, and ECCEnabled from the device
      properties struct. These fields were removed from hipDeviceProp_t in
      ROCm 6.3, causing compilation errors after hipification.
      
      The existing ROCm path was marked as incomplete (# TODO: test for ROC)
      and was never fully functional on recent ROCm versions.
      
      Changes

      - Removed the non-functional ROCm/HIP build path from
      gpu_stream/CMakeLists.txt
      - When CUDA is not found, the build prints a warning and returns
      gracefully instead of attempting a broken hipify build or raising
      FATAL_ERROR
      - No changes to the NVIDIA/CUDA build path; it continues to work as
      before
      
      Impact

      - NVIDIA builds: No change; gpu_stream builds and installs normally
      - ROCm builds: gpu_stream is skipped with a warning message. Previously
      it would fail the entire make cppbuild step, blocking the Docker image
      build
      - Other benchmarks: Unaffected; build.sh continues to the next
      benchmark after gpu_stream returns
      Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
    • CI/CD - Fix setuptools-scm 10.x compatibility for Python 3.12 (#805) · 8c7e2be0
      Hongtao Zhang authored
      
      
      ## Description
      The `setuptools-scm` 10.x incompatibility affects all PRs running
      `python3 setup.py lint` on the Python 3.12 CI job.
      
      ## Root Cause
      
      Comparing the last successful cpu-unit-test build (58939, Mar 25) with a
      recent failing build (58996, Apr 14), the Python 3.12 "Install
      dependencies" step shows:
      
      | Package | Successful (Mar 25) | Failing (Apr 14) |
      |---|---|---|
      | `setuptools-scm` | < 10.0 (no `vcs-versioning` dep) | 10.0.5 (requires `vcs-versioning`) |
      
      `setuptools-scm` 10.0.5 was released between the two runs and added
      `vcs-versioning` as a new dependency. The `setup_requires` mechanism in
      `setup.py` does not install transitive dependencies, so `vcs-versioning`
      is missing at runtime.
      
      The failing build lint log (Python 3.12) shows: "ModuleNotFoundError: No
      module named 'vcs_versioning'"
      
      ## Changes
      
      - Add `vcs_versioning` explicitly to `setup_requires` in `setup.py` so
      it is available when `setuptools-scm` is imported during `setup.py`
      execution.
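
A minimal sketch of how one might confirm, before running the lint step, that the modules named above are importable in the current environment. The helper name `missing_modules` is hypothetical and is not part of `setup.py` or this change.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Per the root cause above, setuptools-scm 10.0.5 needs vcs_versioning at import time.
    print(missing_modules(["setuptools_scm", "vcs_versioning"]))
```

An empty list means both modules are present; any name printed still needs to be installed.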
      
      ## Testing
      
      Verified that `setuptools-scm` 10.0.5 declares `vcs-versioning` as a
      dependency, and the CI failure matches the missing transitive dependency
      pattern.
      Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
    • Benchmarks: Update gpu-hpcg metrics to encode process and problem shape (#8) · 0a1a15ea
      one authored
      * Update gpu-hpcg metrics to encode process and problem shape
      
      * Fix tests
    • SysInfo: Simplify smi commands · d7a56e0b
      one authored
    • Config: Update config files (#7) · 511807b7
      one authored
      - Add BW150 config
      - Update BW1000 config
      - Merge summary rules
    • Runner: Add local numactl GPU affinity support (#6) · 0993db75
      one authored
      - Add `numactl` support for local runner modes, including `cpunodebind`, `membind`, and `physcpubind`.
      - Add `gpu_affinity` resolution through `sb node topo --get gpu-numa-affinity --gpu-id`.
      - Add `sb node topo` support for GPU NUMA topology queries.
      - Update BW1000 config to use the new local `numactl` semantics.
      - Document the new `numactl` mode fields and limitations.
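
To illustrate the `numactl` bindings listed in the commit above, here is a rough sketch of turning the three binding values into a command prefix. The `--cpunodebind`, `--membind`, and `--physcpubind` flags are real `numactl` options. The function names, their parameters, and the way the runner actually assembles its command line are assumptions for illustration only, and `gpu_affinity` resolution via `sb node topo` is not modeled here.

```python
import shlex


def numactl_prefix(cpunodebind=None, membind=None, physcpubind=None):
    """Build a numactl argument list from optional binding values (hypothetical helper)."""
    args = ["numactl"]
    if cpunodebind is not None:
        args.append(f"--cpunodebind={cpunodebind}")
    if membind is not None:
        args.append(f"--membind={membind}")
    if physcpubind is not None:
        args.append(f"--physcpubind={physcpubind}")
    # With no binding requested, return no prefix at all.
    return args if len(args) > 1 else []


def wrap_command(command, **bindings):
    """Prefix a shell command string with the numactl invocation, if any."""
    prefix = numactl_prefix(**bindings)
    if not prefix:
        return command
    return " ".join(shlex.quote(part) for part in prefix) + " " + command


if __name__ == "__main__":
    print(wrap_command("python3 bench.py", cpunodebind=0, membind=0))
```

Leaving all bindings unset returns the command unchanged, so the same code path can serve runs with and without NUMA pinning.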
  3. 20 Apr, 2026 3 commits
  4. 18 Apr, 2026 11 commits
  5. 17 Apr, 2026 4 commits
  6. 15 Apr, 2026 1 commit
  7. 02 Apr, 2026 9 commits
  8. 01 Apr, 2026 5 commits