Commits · f57d86f4d1fdadffb08e7d882a94718df123bb46 · tsoc / superbenchmark

15 Apr, 2026 1 commit
- Update GPU vendors · f57d86f4
  one authored Apr 15, 2026
  
  f57d86f4
17 Nov, 2025 1 commit

Benchmarks: micro benchmarks - add --set_ib_devices option to auto-select IB... · c65ae567

Yuting Jiang authored Nov 17, 2025

Benchmarks: micro benchmarks - add --set_ib_devices option to auto-select IB device by MPI local rank in ib validation (#733)

**Description**
Add a --set_ib_devices option to auto-select the IB device by MPI local rank.


**Major Revision**
- Add a new CLI flag --set_ib_devices to automatically select irregular
IB devices based on the MPI local rank.
- When enabled, the benchmark queries available IB devices via
network.get_ib_devices() and selects the device corresponding to
OMPI_COMM_WORLD_LOCAL_RANK.
- Fall back to existing --ib_dev behavior when the flag is not provided.

**Minor Revision**
- Add an environment variable in network.get_ib_devices() to allow the user
to set the device name
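
The device selection described under Major Revision can be sketched roughly as follows. This is a minimal illustration, not the benchmark's code: `select_ib_device` is a hypothetical helper, the `devices` list stands in for whatever network.get_ib_devices() returns (its exact return format is an assumption), and the device names are made up.

```python
import os


def select_ib_device(devices, env=None):
    """Pick the IB device whose list index equals the MPI local rank.

    `devices` stands in for the result of network.get_ib_devices();
    treating it as a plain list of names is an assumption.
    """
    env = os.environ if env is None else env
    rank_text = env.get("OMPI_COMM_WORLD_LOCAL_RANK")
    if rank_text is None:
        raise RuntimeError("OMPI_COMM_WORLD_LOCAL_RANK is not set")
    local_rank = int(rank_text)
    if not 0 <= local_rank < len(devices):
        raise IndexError(f"local rank {local_rank} has no matching IB device")
    return devices[local_rank]


# Hypothetical device names, for illustration only.
devices = ["mlx5_0", "mlx5_1", "mlx5_2", "mlx5_3"]
print(select_ib_device(devices, env={"OMPI_COMM_WORLD_LOCAL_RANK": "2"}))
```

Passing `env` explicitly keeps the sketch testable outside an MPI launch; under Open MPI the variable would come from the process environment.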

c65ae567

04 Dec, 2023 1 commit

Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in ib-validation (#581) · 9ae8c670

Yuting Jiang authored Dec 04, 2023

**Description**
Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in
ib-validation

**Major Revision**
- Support cpu-gpu and gpu-cpu in ib-validation


**Minor Revision**
- support multi msg size, multi direction, multi ib commands in
ib-validation

9ae8c670

30 Dec, 2022 1 commit
- Benchmarks - Support `pair-wise` pattern in IB validation benchmark (#453) · f2634d86
  Yang Wang authored Dec 30, 2022
  **Description**
  * Reuse `gen_pair_wise_config` in micro-benchmark
  f2634d86
06 Sep, 2022 1 commit

Release - SuperBench v0.6.0 (#409) · 63e9b2d1

Yifan Xiong authored Sep 06, 2022



**Description**

Cherry-pick bug fixes from v0.6.0 to main.

**Major Revisions**

* Enable latency test in ib traffic validation distributed benchmark (#396)
* Enhance parameter parsing to allow spaces in value (#397)
* Update apt packages in dockerfile (#398)
* Upgrade colorlog for NO_COLOR support (#404)
* Analyzer - Update error handling to support exit code of sb result diagnosis (#403)
* Analyzer - Make baseline file optional in data diagnosis and fix bugs (#399)
* Enhance timeout cleanup to avoid possible hanging (#405)
* Auto generate ibstat file by pssh (#402)
* Analyzer - Format int type and unify empty value to N/A in diagnosis output file (#406)
* Docs - Upgrade version and release note (#407)
* Docs - Fix issues in document (#408)
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

63e9b2d1

26 Jul, 2022 1 commit

Support topo-aware IB performance validation (#373) · ef4d6574

Jie Zhang authored Jul 26, 2022



* Support topo-aware IB performance validation

Add a new pattern `topo-aware`, so the user can run IB performance
tests based on the VMs' topology information. This way, the user can
validate the IB performance across VM pairs at different distances
as a quick test instead of a pair-wise test.

To run with the topo-aware pattern, the user needs to specify three required
(and two optional) parameters in the YAML config file:
--pattern        topo-aware
--ibstat         path to ibstat output
--ibnetdiscover  path to ibnetdiscover output
--min_dist       minimum distance of VM pairs (optional, default 2)
--max_dist       maximum distance of VM pairs (optional, default 6)

The newly added topo_aware module then parses the topology
information, builds a graph, and generates the VM pairs with
the specified distance (# hops).

The specified IB test then runs across these generated VM pairs.
Signed-off-by: Jie Zhang <jessezhang1010@gmail.com>
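
The pair generation described above (build a graph, then select VM pairs by hop count) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the topo_aware module itself: parsing of the ibstat and ibnetdiscover output is omitted, the function names and the small topology are hypothetical, and the sketch simply enumerates every pair in range without trying to choose pairs that can run concurrently.

```python
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def pairs_by_distance(graph, vms, min_dist=2, max_dist=6):
    """Group VM pairs by hop distance, keeping only min_dist..max_dist."""
    result = {}
    for i, a in enumerate(vms):
        dist = hop_distances(graph, a)
        for b in vms[i + 1:]:
            d = dist.get(b)
            if d is not None and min_dist <= d <= max_dist:
                result.setdefault(d, []).append((a, b))
    return result


# Hypothetical topology: two leaf switches under one spine, two VMs per leaf.
edges = [("vm0", "leaf0"), ("vm1", "leaf0"), ("vm2", "leaf1"),
         ("vm3", "leaf1"), ("leaf0", "spine"), ("leaf1", "spine")]
graph = {}
for u, v in edges:
    graph.setdefault(u, []).append(v)
    graph.setdefault(v, []).append(u)

vms = ["vm0", "vm1", "vm2", "vm3"]
pairs = pairs_by_distance(graph, vms)
print(pairs)
```

In this toy topology, VMs under the same leaf switch are 2 hops apart and VMs under different leaves are 4 hops apart, so both groups fall within the default 2 to 6 range.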

* Add description about topology aware ib traffic tests
Signed-off-by: Jie Zhang <jessezhang1010@gmail.com>

* Add unit test to verify generated topology aware config file

This commit adds a unit test to verify that the generated topology-aware
config file is correct. Four new data files are added so that the test
can invoke the gen_topo_aware_config function to generate a
topology-aware config file and compare it with the expected config file.
Signed-off-by: Jie Zhang <jessezhang1010@gmail.com>

* Fix lint issue on Azure pipeline
Signed-off-by: Jie Zhang <jessezhang1010@gmail.com>

ef4d6574

25 Jul, 2022 1 commit

Fix unexpected base conversion when the result value is negative (#377) · 5d448eed

Yang Wang authored Jul 25, 2022

Fix an unexpected result value (`-0.125`) issue in ib traffic benchmark when encountering `-1` in raw output
* Check if the value is valid before the base conversion
* Add a test case to cover this situation
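
A rough sketch of the validity check, for illustration only. The divisor of 8 is an assumption inferred from the symptom (`-1 / 8` gives `-0.125`), `convert_result` is a hypothetical name, and passing the invalid value through unchanged is just one possible handling.

```python
def convert_result(raw_value, divisor=8):
    """Convert a raw reading, skipping the conversion for invalid input.

    divisor=8 is an assumption: it reproduces the -0.125 symptom for -1.
    """
    value = float(raw_value)
    if value < 0:
        # Invalid marker in the raw output: do not apply the base conversion.
        return value
    return value / divisor


print(convert_result("80"))
print(convert_result("-1"))
```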

5d448eed

09 Jul, 2022 1 commit

Fix issues in ib validation benchmark (#370) · b2875179

Yifan Xiong authored Jul 09, 2022

Fix several issues in ib validation benchmark:
* continue running when a timeout occurs in the middle, instead of aborting the whole MPI process
* make the timeout parameter configurable, with a default of 120 seconds
* avoid mixing stdio and iostream when printing to stdout
* set the default message size to 8M, which will saturate IB in most cases
* fix the hostfile path issue so that the hostfile can be found automatically in different cases
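
The first two fixes (keep going after a timeout, with a configurable limit defaulting to 120 seconds) can be illustrated with a small Python sketch. This is not the benchmark's C++ code: `run_all` is a hypothetical helper, and the commands below are stand-ins for the real perftest invocations.

```python
import subprocess
import sys


def run_all(commands, timeout=120):
    """Run each command in turn; a timed-out command is recorded as None."""
    results = []
    for cmd in commands:
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True,
                                  timeout=timeout)
            results.append(proc.stdout.strip())
        except subprocess.TimeoutExpired:
            # Record the failure and continue with the remaining commands.
            results.append(None)
    return results


# Stand-in commands: one that sleeps past the limit, one that finishes quickly.
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
fast = [sys.executable, "-c", "print('ok')"]
results = run_all([slow, fast], timeout=3)
print(results)
```

The short 3-second limit is only there to keep the demonstration quick; the default mirrors the 120 seconds mentioned in the commit message.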

b2875179

24 Jun, 2022 1 commit

Support multiple IB/GPU in ib validation (#363) · bfaa1c83

Yifan Xiong authored Jun 24, 2022

**Description**

Support running multiple IB/GPU devices simultaneously in the ib validation benchmark.

**Major Revisions**
- Revise ib_validation_performance.cc so that multiple processes per node can be used to launch multiple perftest commands simultaneously. For each node pair in the config, as many commands as there are processes per node run in parallel.
- Revise ib_validation_performance.py to correct file paths and adjust parameters to specify different NICs/GPUs/NUMA nodes.
- Fix env issues in Dockerfile for end-to-end test.
- Update ib-traffic configuration examples in config files.
- Update unit tests and docs accordingly.

Closes #326.

bfaa1c83

01 Apr, 2022 1 commit

Benchmarks: Add Feature - Provide option to save raw data into file. (#333) · 6d895da8

guoshzhao authored Apr 01, 2022

**Description**
Use the config option `log_raw_data` to control whether the raw data is logged to a file. The default value is `no`. It can be set to `yes` for particular benchmarks, such as the NCCL/RCCL tests, to save the raw data to a file.
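
As a rough illustration of such a toggle (not the project's implementation): `store_raw_data` and the file name are hypothetical, the option is modelled as a boolean rather than `yes`/`no`, and keeping the data in memory when the option is off is an assumption.

```python
import os
import tempfile


def store_raw_data(raw_output, in_memory, log_raw_data=False,
                   log_path="rawdata.log"):
    """Append raw output to a file when log_raw_data is on, else keep it in memory."""
    if log_raw_data:
        with open(log_path, "a", encoding="utf-8") as handle:
            handle.write(raw_output + "\n")
    else:
        in_memory.append(raw_output)


kept = []
log_file = os.path.join(tempfile.mkdtemp(), "rawdata.log")
store_raw_data("sample raw line 1", kept)  # default: not written to a file
store_raw_data("sample raw line 2", kept, log_raw_data=True, log_path=log_file)
print(kept)
```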

6d895da8

30 Dec, 2021 1 commit

Release - SuperBench v0.4.0 (#278) · ff563b66

Yifan Xiong authored Dec 30, 2021



__Description__

Cherry-pick bug fixes from v0.4.0 to main.

__Major Revisions__

* Bug - Fix issues for Ansible and benchmarks (#267)
* Tests - Refine test cases for microbenchmark (#268)
* Bug - Build openmpi with ucx support in rocm dockerfiles (#269)
* Benchmarks: Fix Bug - Fix fio build issue (#272)
* Docs - Unify metric and add doc for cublas and cudnn functions (#271)
* Monitor: Revision - Add 'monitor/' prefix to monitor metrics in result summary (#274)
* Bug - Fix bug of detecting if gpu_index is none (#275)
* Bug - Fix bugs in data diagnosis (#273)
* Bug - Fix issue that the root mpi rank may not be the first in the hostfile (#270)
* Benchmarks: Configuration - Update inference and network benchmarks in configs (#276)
* Docs - Upgrade version and release note (#277)
Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>

ff563b66

09 Dec, 2021 1 commit
- Benchmarks: Unify metric names of benchmarks (#252) · 9f56b219
  Yuting Jiang authored Dec 09, 2021
  **Description**
  Unify metric names of benchmarks.
  9f56b219
09 Nov, 2021 1 commit

Benchmarks: Add Benchmark - Add ib traffic validation distributed benchmark (#215) · 54919424

Yuting Jiang authored Nov 10, 2021

**Description**
Add ib traffic validation distributed benchmark.

**Major Revision**
- Add ib traffic validation distributed benchmark, example and test

54919424