Commits · 8b805d90fe49a8452b330eabb72f6fb9b1fa81cd · tsoc / superbenchmark

28 Jan, 2026 1 commit

CI/CD - Fix Image build for cuda11.1.1 (#771) · 8b805d90

Hongtao Zhang authored Jan 28, 2026



**Description**

- When building the CUDA 11.1.1 image, pip (Python 3.8) cannot find a
pre-built wheel for the latest wandb release (v0.23.1), so it attempts
to build wandb from source. The build then fails because the image does
not have Go installed, which is required to build wandb from source.

**Solution**

- For the CUDA 11.1.1 build, install the build tools (e.g., Go, Rust,
and Cargo) that wandb needs to build from source.
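
A minimal sketch of how one might check for these tools before running pip. The executable names `go`, `rustc`, and `cargo` and the helper `missing_build_tools` are assumptions for illustration and are not taken from the repository's Dockerfile.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names; the commit message only says "Go, Rust, and Cargo".
REQUIRED_TOOLS = ("go", "rustc", "cargo")


def missing_build_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("missing build tools:", ", ".join(missing))
    else:
        print("all build tools found")
```

Passing `which` as a parameter keeps the check testable without touching the real PATH.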

---------
Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

8b805d90

06 Nov, 2025 1 commit
- Fix pipelines - Update mlc version in dockerfiles from v3.11 to v3.12 (#752) · 25db1115
  WenqingLan1 authored Nov 06, 2025
```
Updated mlc wget link in dockerfiles.

---------
Co-authored-by: guoshzhao <guzhao@microsoft.com>
```
  25db1115
30 Apr, 2025 1 commit

CI/CD - Update OS of runner to the latest. (#702) · 330c68aa

Hongtao Zhang authored Apr 30, 2025



- Upgrade OS of github runner used by lint to the latest.
- Add symbolic link for clang-format to version 14.
- Update importlib_metadata version since it is too old (inside
nvcr.io/nvidia/pytorch:20.12-py3) and failed the 11.1 build.

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

330c68aa

21 Nov, 2024 1 commit

Benchmarks: micro benchmarks - add nvbandwidth build (#665) · c8c52eb2

Hongtao Zhang authored Nov 21, 2024



**Description**
Add nvbandwidth build to repo

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>

c8c52eb2

10 Oct, 2024 1 commit

Release - SuperBench v0.11.0 (#654) · 949f9cb4

Yuting Jiang authored Oct 10, 2024



**Description**
Cherry pick bug fixes from v0.11.0 to main

**Major Revision**
* #645 
* #648 
* #646 
* #647 
* #651 
* #652 
* #650

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

949f9cb4

18 Apr, 2024 1 commit
- Dockerfile - Upgrade mlc to v3.11 (#620) · dc3846cb
  Yuting Jiang authored Apr 18, 2024
```
**Description**
Upgrade mlc to v3.11.
```
  dc3846cb
07 Dec, 2023 1 commit
- Benchmarks: Add benchmark: Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark (#582) · dd5a6329
  Yuting Jiang authored Dec 07, 2023
```
**Description**
Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark
```
  dd5a6329
22 Nov, 2023 1 commit

Analyzer - Generate baseline given results from multiple nodes. (#575) · 9f4880cb

guoshzhao authored Nov 22, 2023



**Description**
Generate baseline given results from multiple nodes. 

**Major Revision**
- Add sub command `sb result generate-baseline`
- Add UT and docs

---------
Co-authored-by: 454314380 <454314380@qq.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

9f4880cb

23 Oct, 2023 1 commit

Dockerfile - update mlc version into 3.10 for cuda and rocm dockerfiles (#562) · d246bab4

Yuting Jiang authored Oct 23, 2023



**Description**
Update the mlc version to 3.10 in the cuda and rocm dockerfiles to be
consistent with the cuda12 dockerfile.
Co-authored-by: yukirora <yuting.jiang@microsoft.com>

d246bab4

22 Aug, 2023 1 commit
- Benchmarks: micro benchmark - source code for evaluating NVDEC decoding performance (#560) · 27a10811
  Yuting Jiang authored Aug 22, 2023
```
**Description**
source code for evaluating NVDEC decoding performance.

---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
```
  27a10811
21 Mar, 2023 1 commit

Adding HPL benchmark (#482) · 655bd0aa

rafsalas19 authored Mar 21, 2023



**Description**

- Adding HPL benchmark

---------
Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net>
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>

655bd0aa

13 Feb, 2023 1 commit

Adding Stream Benchmark (#473) · 32896ca4

rafsalas19 authored Feb 13, 2023



**Description**

- Added stream benchmark
- Added stream unit test
- Added stream example
- Modified docker files to build stream

---------
Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net>
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
Co-authored-by: Yifan Xiong <xiongyf@yandex.com>

32896ca4

29 Dec, 2022 1 commit

Dockerfile - Add CUDA11.8 Docker image for Nvidia arch90 GPUs (#449) · a3c65b2a

Yifan Xiong authored Dec 29, 2022

Add Docker image for arch90 NVIDIA GPUs:

* add CUDA11.8 Dockerfile
* update archs in Makefile and benchmarks accordingly
* update image build pipeline

a3c65b2a

31 Oct, 2022 1 commit

CLI - Update version to include revision hash and date (#427) · d7bb8303

Yifan Xiong authored Oct 31, 2022

Update the version to include the revision hash and date in
"{last tag}+g{git hash}.d{date}" format. Examples:
* exact tag: 0.6.0
* commit after tag: 0.6.0+gcbb1b34
* commit after tag with local changes: 0.6.0+gcbb1b34.d20221028
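
The format can be sketched as a small helper that reproduces the three examples above. The function name `format_version` and its parameters are hypothetical. The case of local changes on an exact tag is not covered by the examples, so the sketch rejects it rather than guessing.

```python
from typing import Optional


def format_version(
    last_tag: str,
    git_hash: Optional[str] = None,
    date: Optional[str] = None,
) -> str:
    """Build a "{last tag}+g{git hash}.d{date}" style version string.

    git_hash: short hash, given only when the commit is after the tag.
    date: YYYYMMDD string, given only when there are local changes.
    """
    if date and not git_hash:
        # Assumption: the commit message gives no example for this case.
        raise ValueError("date without git_hash is not covered by the examples")
    version = last_tag
    if git_hash:
        version += f"+g{git_hash}"
    if date:
        version += f".d{date}"
    return version


print(format_version("0.6.0"))
print(format_version("0.6.0", "cbb1b34"))
print(format_version("0.6.0", "cbb1b34", "20221028"))
```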

d7bb8303

06 Sep, 2022 1 commit

Release - SuperBench v0.6.0 (#409) · 63e9b2d1

Yifan Xiong authored Sep 06, 2022



**Description**

Cherry-pick bug fixes from v0.6.0 to main.

**Major Revisions**

* Enable latency test in ib traffic validation distributed benchmark (#396)
* Enhance parameter parsing to allow spaces in value (#397)
* Update apt packages in dockerfile (#398)
* Upgrade colorlog for NO_COLOR support (#404)
* Analyzer - Update error handling to support exit code of sb result diagnosis (#403)
* Analyzer - Make baseline file optional in data diagnosis and fix bugs (#399)
* Enhance timeout cleanup to avoid possible hanging (#405)
* Auto generate ibstat file by pssh (#402)
* Analyzer - Format int type and unify empty value to N/A in diagnosis output file (#406)
* Docs - Upgrade version and release note (#407)
* Docs - Fix issues in document (#408)
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

63e9b2d1

17 Aug, 2022 1 commit

Update Python setup for require packages (#387) · 626ac0a4

Yifan Xiong authored Aug 17, 2022

__Description__

Update Python setup for required packages.

__Major Revisions__
* downgrade requests version to be compatible with python 3.6, add corresponding pipeline for 3.6
* add extra entry in extras_require for nested packages
* update `pip install` contents accordingly

626ac0a4

13 Aug, 2022 1 commit

Auto generate ibstat file for topo aware traffic pattern (#381) · faeee0a7

Yang Wang authored Aug 13, 2022

An enhancement for topo-aware IB performance validation #373.
This PR auto-generates the required ibstat file `ib_traffic_topo_aware_ibstat.txt`, which is used as input to build a graph.

faeee0a7

13 Jul, 2022 1 commit

Add dependencies (#374) · 16b6385d

Yifan Xiong authored Jul 13, 2022

Add dependencies

* include ndv4-topo.xml in cuda docker images
* require requests version to avoid RequestsDependencyWarning

16b6385d

06 Jul, 2022 1 commit

Update dependencies and Dockerfile (#371) · 9f03d568

Yifan Xiong authored Jul 06, 2022

Update dependencies and Dockerfile:
* upgrade nccl-tests and rccl-tests to current latest version to match
  NCCL/RCCL versions
* unify image tag names on DockerHub
* remove verbose output in Dockerfile and fix some flags

9f03d568

24 Jun, 2022 2 commits

Fix incorrect ulimit config in Dockerfile (#364) · 325a7338

Yifan Xiong authored Jun 24, 2022

Fix incorrect ulimit nofile config in Dockerfile.

By default, sh is used instead of bash. Its `echo` does not accept any options, so the literal `-e` was written into /etc/security/limits.conf.
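
A minimal sketch of how one might detect this symptom in an existing limits.conf. The helper name `find_stray_echo_flags` and the sample nofile values are hypothetical and are not taken from the Dockerfile.

```python
from typing import List, Tuple


def find_stray_echo_flags(limits_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs whose first field is a literal -e."""
    hits = []
    for number, line in enumerate(limits_text.splitlines(), start=1):
        fields = line.split(maxsplit=1)
        if fields and fields[0] == "-e":
            hits.append((number, line))
    return hits


# Hypothetical file content as an sh `echo -e ...` could have written it.
sample = "-e * soft nofile 1048576\n* hard nofile 1048576\n"
for number, line in find_stray_echo_flags(sample):
    print(f"line {number}: {line}")
```

To scan a real file, pass in the text read from /etc/security/limits.conf.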

325a7338

Support multiple IB/GPU in ib validation (#363) · bfaa1c83

Yifan Xiong authored Jun 24, 2022

**Description**

Support running multiple IB/GPU devices simultaneously in the ib validation benchmark.

**Major Revisions**
- Revise ib_validation_performance.cc so that multiple processes per node can launch multiple perftest commands simultaneously. For each node pair in the config, the configured number of processes per node run in parallel.
- Revise ib_validation_performance.py to correct file paths and adjust parameters to specify different NICs/GPUs/NUMA nodes.
- Fix env issues in Dockerfile for end-to-end test.
- Update ib-traffic configuration examples in config files.
- Update unit tests and docs accordingly.

Closes #326.

bfaa1c83

15 Jun, 2022 1 commit

Fix cmake and build issues (#360) · 60a3c743

Yifan Xiong authored Jun 15, 2022

**Description**

Fix cmake and build issues.

**Major Revision**

* Remove unnecessary boost build
* Remove user-agent for mlc
* Remove -j for third party to build each project in sequence
* Fix ansible collections installation path

60a3c743

31 May, 2022 1 commit
- Dockerfile - Add support to run sb command inside docker image (#356) · 3f135e46
  user4543 authored Jun 01, 2022
```
**Description**
Add support to run sb command inside docker image - install missing dependency.
```
  3f135e46
08 Feb, 2022 1 commit

Benchmarks: Add Feature - Add GDR-only nccl-tests for Nvidia machines (#299) · 433785fd

Ziyue Yang authored Feb 08, 2022

This commit adds GDR-only nccl-tests for Nvidia machines. Also bump NCCL to v2.10.3-1 to achieve peak performance in this test.

433785fd

13 Dec, 2021 1 commit

Benchmarks: Add Benchmark - Add mlc benchmark to superbench (#216) · b590409e

Hossein Pourreza authored Dec 12, 2021

**Description**
Add mlc memory bandwidth and latency micro benchmark to Superbench.

**Major Revision**
- Add mlc benchmark with test and example files

b590409e

10 Dec, 2021 1 commit

Benchmarks: Add Benchmark - Add ONNXRuntime inference benchmark based on ORT python API (#245) · 4d85630a

guoshzhao authored Dec 10, 2021

**Description**
Add ONNXRuntime inference benchmark based on ORT python API.

**Major Revision**
- Add `ORTInferenceBenchmark` class to export pytorch model to onnx model and do inference
- Add tests and example for `ort-inference` benchmark
- Update the introduction docs.

4d85630a

30 Oct, 2021 1 commit

Benchmarks: Add Feature - Add CPU-initiated copy and dtod support to gpu-sm-copy benchmark (#230) · 008e0fe1

Ziyue Yang authored Oct 30, 2021

**Description**
This commit does the following:
1) Adds CPU-initiated copy benchmark;
2) Adds dtod benchmark;
3) Supports scanning NUMA nodes and GPUs inside the benchmark program;
4) Changes the name of gpu-sm-copy to gpu-copy.

008e0fe1

02 Sep, 2021 1 commit

Dockerfile - Fix ulimit nofile in Docker images (#183) · 4e431f11

Yifan Xiong authored Sep 02, 2021

__Description__

Resolve the "too many open files" issue when running NCCL/RCCL on multiple nodes using Docker images by increasing the nofile number in limits.conf.
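
A minimal sketch of how one might confirm the effective nofile limit from inside a container, using Python's standard `resource` module (Unix only). The helper names and the threshold of 65536 are assumptions chosen for illustration; the commit does not state the value it sets.

```python
import resource
from typing import Optional, Tuple


def nofile_limits() -> Tuple[int, int]:
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def nofile_is_at_least(minimum: int, limits: Optional[Tuple[int, int]] = None) -> bool:
    """Check whether the soft nofile limit is unlimited or at least `minimum`."""
    soft, _hard = limits if limits is not None else nofile_limits()
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"nofile soft={soft} hard={hard}")
    # 65536 is a hypothetical threshold, not the value from the Dockerfile.
    print("sufficient:", nofile_is_at_least(65536))
```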

4e431f11

01 Sep, 2021 2 commits
- Dockerfile: Add Package - Install openmpi for ROCm images (#181) · 115cd2e6
  guoshzhao authored Sep 01, 2021
```
**Description**
Install openmpi-4.0.0 for ROCm images.
```
  115cd2e6
- Benchmarks: Docker Benchmarks - Setup Docker-in-Docker environment (#180) · 7d947757
  guoshzhao authored Sep 01, 2021
```
**Description**
Set up docker environment in docker container.

**Major Revision**
- Install docker client for cuda and rocm images.
- Mount /var/run/docker.sock from host
```
  7d947757
29 Jul, 2021 1 commit

Benchmarks: Build Pipeline - Support rocm in third_party/makefile (#140) · c88ce056

Yuting Jiang authored Jul 29, 2021

**Description**
Support rocm in third_party/makefile.

**Major Revision**
- Split rocm and cuda target in makefile
- Add target in dockerfile

c88ce056

16 Jul, 2021 2 commits
- Benchmarks: Build Pipeline - Add perftest as a submodule and add build logic (#129) · 419dea26
  Yuting Jiang authored Jul 16, 2021
```
Add perftest as a submodule and add build logic
```
  419dea26
- Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic. (#128) · 8c8beb4b
  Yuting Jiang authored Jul 16, 2021
```
Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic.
```
  8c8beb4b
16 Jun, 2021 1 commit

Dockerfile - Update CUDA 11.1.1 Dockerfile (#96) · 25ec3a7c

Yifan Xiong authored Jun 16, 2021

Update packages and add build cache for CUDA 11.1.1 Dockerfile:

* Remove duplicate cmake and ompi, which are already in base image
* Add hpcx and sharp lib
* Add cache for gitmodules build
* Sort apt-get packages

25ec3a7c

01 Jun, 2021 2 commits
- Benchmarks: Add Feature - Add nvml package to provide python interfaces of nvidia. (#91) · 331c740a
  guoshzhao authored Jun 01, 2021
  
  331c740a
- Benchmarks: Build Pipeline - Add cutlass as a submodule and add build logic. (#85) · 40d7905e
  guoshzhao authored Jun 01, 2021
```
* add cutlass as submodule.
* add build script for cutlass.
* only support compute capability 7.0(V100) and 8.0(A100)
```
  40d7905e
18 May, 2021 1 commit
- Benchmarks: Build Pipeline - Call build script when setup environment. (#76) · 94d3765b
  guoshzhao authored May 18, 2021
```
* call build script in Makefile.
* add cppbuild command for testing and docker env.
```
  94d3765b
17 May, 2021 1 commit
- CI/CD - Add GitHub Action to build and push image (#70) · af6eb004
  Yifan Xiong authored May 17, 2021
```
* add GitHub Action to build and push image
* update Dockerfile to copy from context
```
  af6eb004
14 Apr, 2021 1 commit
- Setup: Code Revision - Rename dev branch to main in config and readme (#55) · 74c4d1b2
  Yifan Xiong authored Apr 14, 2021
```
* Rename dev branch to main and set it as default.
```
  74c4d1b2
13 Apr, 2021 1 commit

Executor - Fix issues when executing benchmarks (#51) · 8c527308

Yifan Xiong authored Apr 13, 2021

* fix missing package in dockerfile
* update benchmark list and parameters
* catch runtime errors
* refine logging info

8c527308