1. 21 Apr, 2026 1 commit
    • one's avatar
      Runner: Add local numactl GPU affinity support (#6) · 0993db75
      one authored
      - Add `numactl` support for local runner modes, including `cpunodebind`, `membind`, and `physcpubind`.
      - Add `gpu_affinity` resolution through `sb node topo --get gpu-numa-affinity --gpu-id`.
      - Add `sb node topo` support for GPU NUMA topology queries.
      - Update BW1000 config to use the new local `numactl` semantics.
      - Document the new `numactl` mode fields and limitations.
      0993db75
  2. 15 Apr, 2026 1 commit
  3. 04 Dec, 2025 1 commit
  4. 17 Nov, 2025 1 commit
    • Yuting Jiang's avatar
      Benchmarks: micro benchmarks - add --set_ib_devices option to auto-select IB... · c65ae567
      Yuting Jiang authored
      Benchmarks: micro benchmarks - add --set_ib_devices option to auto-select IB device by MPI local rank in ib validation (#733)
      
      **Description**
      add --set_ib_devices option to auto-select IB device by MPI local rank 
      
      
      **Major Revision**
      - Add a new CLI flag --set_ib_devices to automatically select irregular
      IB devices based on the MPI local rank.
      - When enabled, the benchmark queries available IB devices via
      network.get_ib_devices() and selects the device corresponding to
      OMPI_COMM_WORLD_LOCAL_RANK.
      - Fall back to existing --ib_dev behavior when the flag is not provided.
      
      **Minor Revision**
      - Add an env in network.get_ib_devices() to allow user to set the device
      name
      c65ae567
  5. 10 Oct, 2024 1 commit
  6. 08 Jan, 2024 1 commit
    • Yifan Xiong's avatar
      Release - SuperBench v0.10.0 (#607) · 2c88db90
      Yifan Xiong authored
      **Description**
      
      Cherry-pick bug fixes from v0.10.0 to main.
      
      **Major Revisions**
      
      * Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
      * Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
      * Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
      * Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
      * Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
      * CI/CD - Add ndv5 topo file #597
      * Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
      * Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
      * Dockerfile - Bug fix for rocm docker build and deploy #598
      * Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
      * Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
      * Monitor - U...
      2c88db90
  7. 27 Nov, 2023 1 commit
    • guoshzhao's avatar
      Monitor - Add support for AMD GPU. (#580) · 028819b3
      guoshzhao authored
      **Description**
      Add AMD support in monitor.
      
      **Major Revision**
      - Add library pyrsmi to collect metrics.
      - Currently can get device_utilization, device_power, device_used_memory
      and device_total_memory.
      028819b3
  8. 05 Jul, 2023 1 commit
  9. 14 Apr, 2023 1 commit
    • Yifan Xiong's avatar
      Release - SuperBench v0.8.0 (#517) · 51761b3a
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.8.0 to main.
      
      **Major Revisions**
      
      * Monitor - Fix the cgroup version checking logic (#502)
      * Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
      * Fix wrong torch usage in communication wrapper for Distributed
      Inference Benchmark (#505)
      * Analyzer: Fix bug in python3.8 due to pandas api change (#504)
      * Bug - Fix bug to get metric from cmd when error happens (#506)
      * Monitor - Collect realtime GPU power when benchmarking (#507)
      * Add num_workers argument in model benchmark (#511)
      * Remove unreachable condition when write host list (#512)
      * Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
      * Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
      * Docs - Upgrade version and release note (#508)
      Co-authored-by: default avatarguoshzhao <guzhao@microsoft.com>
      Co-authored-by: default avatarZiyue Yang <ziyyang@microsoft.com>
      Co-authored-by: default avatarYuting Jiang <yutingjiang@microsoft.com>
      51761b3a
  10. 22 Mar, 2023 1 commit
  11. 13 Feb, 2023 1 commit
  12. 04 Jan, 2023 1 commit
    • Yang Wang's avatar
      Runner - Generate host groups file in mpi mode (#458) · 8e748d56
      Yang Wang authored
      **Major Revision**
      
      - Add an option for pattern to generate mpi_pattern.txt file if
      specified the path.
      - In mpi pattern, serial_index and parallel_index will add in each
      benchmark as environment variables.
      
      **Minor Revision**
      - Fix typo
      8e748d56
  13. 03 Jan, 2023 1 commit
  14. 30 Dec, 2022 1 commit
    • Yuting Jiang's avatar
      Executor - Add stdout logging util module and enable real-time logging flushing in executor (#445) · 9dfefce3
      Yuting Jiang authored
      **Description**
      Add stdout logging util module and enable real-time logging flushing in executor
      
      **Major Revision**
      - Add stdout logging util module to redirect stdout into file log
      - enable stdout logging in executor to write benchmark output into both stdout and file `sb-bench.log`
      - enable real-time log flushing in run_command of microbenchmarks through config `log_flushing`
      
      **Minor Revision**
      - add log_n_step args to enable regular step time log in model benchmarks 
      - udpate related docs
      9dfefce3
  15. 29 Dec, 2022 1 commit
  16. 29 Nov, 2022 1 commit
    • Yang Wang's avatar
      Runner - support 'pattern' in 'mpi' mode to run tasks in parallel (#430) · e4eeda0a
      Yang Wang authored
      * add mpi-parallels mode
      
      * update according to comments
      
      * fix and update doc
      
      * update
      
      * merge into 'mpi' mode
      
      * udpate according to comments
      
      * fix testcases
      
      * fix ansible
      
      * regard pattern as field
      
      * udpate
      
      * fix flake8 version
      
      * add flake8 range
      
      * remove map-by from host config
      
      * udpate comments
      e4eeda0a
  17. 06 Sep, 2022 1 commit
    • Yifan Xiong's avatar
      Release - SuperBench v0.6.0 (#409) · 63e9b2d1
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.6.0 to main.
      
      **Major Revisions**
      
      * Enable latency test in ib traffic validation distributed benchmark (#396)
      * Enhance parameter parsing to allow spaces in value (#397)
      * Update apt packages in dockerfile (#398)
      * Upgrade colorlog for NO_COLOR support (#404)
      * Analyzer - Update error handling to support exit code of sb result diagnosis (#403)
      * Analyzer - Make baseline file optional in data diagnosis and fix bugs (#399)
      * Enhance timeout cleanup to avoid possible hanging (#405)
      * Auto generate ibstat file by pssh (#402)
      * Analyzer - Format int type and unify empty value to N/A in diagnosis output file (#406)
      * Docs - Upgrade version and release note (#407)
      * Docs - Fix issues in document (#408)
      Co-authored-by: default avatarYang Wang <yangwang1@microsoft.com>
      Co-authored-by: default avatarYuting Jiang <yutingjiang@microsoft.com>
      63e9b2d1
  18. 13 Aug, 2022 1 commit
  19. 26 Jul, 2022 1 commit
    • Jie Zhang's avatar
      Support topo-aware IB performance validation (#373) · ef4d6574
      Jie Zhang authored
      
      
      * Support topo-aware IB performance validation
      
      Add a new pattern `topo-aware`, so the user can run IB performance
      test based on VM's topology information. This way, the user can
      validate the IB performance across VM pairs with different distance
      as a quick test instead of pair-wise test.
      
      To run with topo-aware pattern, user needs to specify three required
      (and two optional) parameters in YAML config file:
      --pattern	topo-aware
      --ibstat	path to ibstat output
      --ibnetdiscover	path to ibnetdiscover output
      --min_dist	minimum distance of VM pairs (optional, default 2)
      --max_dist	maximum distance of VM pairs (optional, default 6)
      
      The newly added topo_aware module then parses the topology
      information, builds a graph, and generates the VM pairs with
      the specified distance (# hops).
      
      The specified IB test will then be running across these
      generated VM pairs.
      Signed-off-by: default avatarJie Zhang <jessezhang1010@gmail.com>
      
      * Add description about topology aware ib traffic tests
      Signed-off-by: default avatarJie Zhang <jessezhang1010@gmail.com>
      
      * Add unit test to verify generated topology aware config file
      
      This commit adds unit test to verify the generated topology aware
      config file is correct. To do so, four new data files are added in
      order to invoke gen_topo_aware_config function to generate topology
      aware config file, then compares it with the expected config file.
      Signed-off-by: default avatarJie Zhang <jessezhang1010@gmail.com>
      
      * Fix lint issue on Azure pipeline
      Signed-off-by: default avatarJie Zhang <jessezhang1010@gmail.com>
      ef4d6574
  20. 05 Jul, 2022 1 commit
  21. 24 Jan, 2022 1 commit
  22. 15 Nov, 2021 1 commit
    • guoshzhao's avatar
      Benchmarks: Add Feature - Extend the device manager utility to support more functions. (#239) · cc70f9c1
      guoshzhao authored
      **Description**
      Rename `nvidia_helper` utility as `device_manager` module and support more functions:
      ```
      device_manager.get_device_count()
      device_manager.get_device_utilization(idx)
      device_manager.get_device_temperature(idx)
      device_manager.get_device_power_limit(idx)
      device_manager.get_device_memory(idx)
      device_manager.get_device_row_remapped_info(idx)
      device_manager.get_device_ecc_error(idx)
      ```
      cc70f9c1
  23. 31 Aug, 2021 1 commit
  24. 13 Jul, 2021 1 commit
  25. 09 Jul, 2021 1 commit
  26. 02 Jul, 2021 1 commit
    • Yifan Xiong's avatar
      Runner - Fetch benchmarks results on all nodes (#116) · fb7d4a73
      Yifan Xiong authored
      Fetch benchmarks results on all nodes, will rsync after each benchmark.
      The results directory structure on control node is as follows:
      
      ```
      outputs/
      └── datetime
          ├── nodes
          │   └── node-0
          │       ├── benchmarks
          │       │   ├── benchmark-0
          │       │   │   ├── rank-0
          │       │   │   │   └── results.json
          │       └── sb-exec.log
          ├── sb-run.log
          └── sb.config.yaml
      ```
      fb7d4a73
  27. 01 Jul, 2021 1 commit
  28. 23 Jun, 2021 1 commit
    • Yifan Xiong's avatar
      Bug bash - Fix bugs in multi GPU benchmarks (#98) · c0c43b8f
      Yifan Xiong authored
      * Add `sb deploy` command content.
      * Fix inline if-expression syntax in playbook.
      * Fix quote escape issue in bash command.
      * Add custom env in config.
      * Update default config for multi GPU benchmarks.
      * Update MANIFEST.in to include jinja2 template.
      * Require jinja2 minimum version.
      * Fix occasional duplicate output in Ansible runner.
      * Fix mixed color from Ansible and Python colorlog.
      * Update according to comments.
      * Change superbench.env from list to dict in config file.
      c0c43b8f
  29. 16 Jun, 2021 1 commit
    • Yifan Xiong's avatar
      Bug bash - Fix bugs and refine log in single GPU benchmarks (#97) · ddbc51a1
      Yifan Xiong authored
      Fix bugs and refine log in single GPU benchmarks:
      
      * Fix none framework issue
      * Fix empty parameter bug
      * Remove missed mobilenet_v3 models
      * Change benchmark registration log to debug level
      * Add pid in logging
      * Add missing benchmarks in default config
      * Fix deprecated logging warn
      ddbc51a1
  30. 01 Jun, 2021 1 commit
  31. 18 May, 2021 1 commit
  32. 11 May, 2021 1 commit
  33. 29 Mar, 2021 1 commit
    • Yifan Xiong's avatar
      Update logger (#28) · 0e2b2b08
      Yifan Xiong authored
      Update logger class.
      * add file handler along with stream handler
      * add colored formatter
      0e2b2b08
  34. 26 Mar, 2021 1 commit
  35. 12 Mar, 2021 1 commit
  36. 04 Mar, 2021 1 commit
  37. 24 Feb, 2021 1 commit