Commits · b98642442e2b6d86290a0c7c2c1822e8fcaf09b3 · tsoc / superbenchmark

08 Oct, 2025 1 commit
- CI/CD - Fix image merge in GitHub Action. (#749) · b9864244
  Yifan Xiong authored Oct 07, 2025
```
Fix image merge for release event in GitHub Action.
```
  b9864244
01 Oct, 2025 1 commit

Dockerfile - add cuda13.0.dockerfile (#739) · 60189dd6

WenqingLan1 authored Oct 01, 2025



Add support for cuda13.0.
Add cuda13.0.dockerfile.
Add cuda13.0 image build task to the GitHub pipeline.
Update GPU STREAM to work with cuda13.0.
Fix data type conversion perf bug in GPU STREAM.
Update nvbandwidth submodule to v0.8.
Update perftest submodule to 4bee61f80d9e268fc97eaf40be00409e91d3a19e
(recent master).

---------
Co-authored-by: Ubuntu <dilipreddi@gmail.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>

60189dd6

12 Aug, 2025 1 commit

Release - SuperBench v0.12.0 (#729) · 0b4311cd

Hongtao Zhang authored Aug 12, 2025



**Description**

Cherry-pick bug fixes from v0.12.0 to main.

**Major Revisions**

* #725
* #727
* #728
Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yixio@microsoft.com>
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>

---------
Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>

0b4311cd

25 Jun, 2025 1 commit

Dockerfile - Add cuda12.9 docker image (#716) · a56356d8

guoshzhao authored Jun 25, 2025



**Description**
Add cuda 12.9 dockerfile and build in pipeline.

---------
Co-authored-by: Guoshuai Zhao <microsoft@microsoft.com>
Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
Co-authored-by: Hongtao Zhang <garyworkzht@gmail.com>

a56356d8

09 Apr, 2025 1 commit
- CI/CD - Merge multi-arch image (#696) · b13ef28f
  Yifan Xiong authored Apr 08, 2025
```
Merge multi-arch image in build pipeline.
```
  b13ef28f
21 Mar, 2025 1 commit

Dockerfile - Support cuda12.8 for Blackwell arch (#682) · 294f1f20

pdr authored Mar 20, 2025



**Description**
Updated docker for cuda 12.8.
Use the latest cutlass release (3.8) with ARCH 100 (Blackwell) support.
Add the latest nccl-test release with ARCH 100 (Blackwell) support.
Updated msccl to support builds for sm_100.
No breaking changes; backward compatibility tested with cuda 12.4.

---------
Co-authored-by: Hongtao Zhang <garyworkzht@gmail.com>

294f1f20

12 Mar, 2025 1 commit

CI/CD - Update label in the ROCm image build (#693) · 48cd8a3c

Hongtao Zhang authored Mar 12, 2025



This is due to the matrix strategy’s default "fail-fast" setting. In GitHub
Actions, when a job runs with a matrix, the individual configurations
run in parallel. By default, if one matrix job (for example, the one
labeled "rocm6_2_rocm6_2_x_superbe") fails, the remaining parallel jobs
are canceled automatically.

In our current image build pipeline, the arm64 build job is always
canceled when the rocm build job fails. As a temporary solution, use a
non-existent label in the job config to prevent the rocm build job from
being scheduled.

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>

48cd8a3c
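
The fail-fast behavior described in the commit message above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not GitHub's implementation: `MatrixJob`, `run_matrix`, the job names, and the abstract durations are all hypothetical. The one real setting it mirrors is the workflow key `strategy.fail-fast`, which defaults to true.

```python
from dataclasses import dataclass


@dataclass
class MatrixJob:
    """One matrix configuration (hypothetical model, not a GitHub API)."""
    name: str
    duration: int   # abstract time units until the job finishes
    succeeds: bool  # outcome if the job is allowed to finish


def run_matrix(jobs, fail_fast=True):
    """Return {job name: 'success' | 'failure' | 'canceled'}.

    Assumption: all jobs start at time 0 and run in parallel. With
    fail_fast enabled, any job still running when the first failure
    occurs is canceled.
    """
    first_failure = min(
        (job.duration for job in jobs if not job.succeeds), default=None
    )
    results = {}
    for job in jobs:
        still_running = first_failure is not None and job.duration > first_failure
        if fail_fast and still_running:
            results[job.name] = "canceled"
        else:
            results[job.name] = "success" if job.succeeds else "failure"
    return results


# Illustrative matrix: the rocm job fails early, the others would succeed.
jobs = [
    MatrixJob("rocm", duration=1, succeeds=False),
    MatrixJob("arm64", duration=5, succeeds=True),
    MatrixJob("cuda", duration=3, succeeds=True),
]
print(run_matrix(jobs, fail_fast=True))
print(run_matrix(jobs, fail_fast=False))
```

In this model, the early rocm failure cancels the slower arm64 and cuda jobs when `fail_fast` is true. With `fail_fast=False` they run to completion. Keeping the failing job from being scheduled at all, as the commit does, avoids the cancellation without changing the fail-fast setting.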

07 Mar, 2025 1 commit
- CI/CD - Add image build on arm64 arch (#690) · 300df46b
  Yifan Xiong authored Mar 07, 2025
```
Add image build on arm64 arch.
```
  300df46b
06 Nov, 2024 1 commit

Dockerfile - Add support for arm64 build (#660) · 47949127

pdr authored Nov 06, 2024

Add support for arm64 build:

- Updated dockerfile for arm64 build
- extend cpu stream compilation for neoverse 
- handle onnxruntime-gpu installation
- third party builds filtering based on arch
- disable cuda decode perf build for non x86

47949127

02 Nov, 2024 1 commit

CI/CD - Update Image Build Pipeline (#659) · 61770b89

Yifan Xiong authored Nov 01, 2024

**Description**

Update image build.

**Major Revision**

* Remove ROCm 6.0 image due to outdated packages
* Remove build tag for ROCm
* Preserve build cache for 30 days

61770b89

10 Oct, 2024 1 commit

Release - SuperBench v0.11.0 (#654) · 949f9cb4

Yuting Jiang authored Oct 10, 2024



**Description**
Cherry-pick bug fixes from v0.11.0 to main.

**Major Revision**
* #645 
* #648 
* #646 
* #647 
* #651 
* #652 
* #650

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

949f9cb4

28 Jul, 2024 1 commit
- CI/CD - Fix MSCCL build error in CUDA12.4 docker build pipeline (#633) · 2101e933
  Yuting Jiang authored Jul 29, 2024
```
**Description**
Fix MSCCL build error in CUDA12.4 docker build pipeline due to OOM
issue.
```
  2101e933
22 Apr, 2024 1 commit

Dockerfile - Add CUDA 12.4 dockerfile (#619) · 7435f10a

Yuting Jiang authored Apr 22, 2024

**Description**
Add CUDA 12.4 dockerfile.

**Major Revision**
- upgrade nvidia docker to 23.04


**Minor Revision**
- upgrade hpcx to 2.18

7435f10a

08 Jan, 2024 1 commit

Release - SuperBench v0.10.0 (#607) · 2c88db90

Yifan Xiong authored Jan 07, 2024

**Description**

Cherry-pick bug fixes from v0.10.0 to main.

**Major Revisions**

* Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
* Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
* Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
* Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
* Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
* CI/CD - Add ndv5 topo file #597
* Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
* Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
* Dockerfile - Bug fix for rocm docker build and deploy #598
* Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
* Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
* Monitor - U...

2c88db90

09 Dec, 2023 1 commit

Dockerfile - Upgrade to rocm5.7 dockerfile (#587) · 1f5031bd

Yuting Jiang authored Dec 10, 2023



**Description**
upgrade to rocm5.7 dockerfile.

---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>

1f5031bd

07 Dec, 2023 1 commit
- Benchmarks: Add MSCCL Support for Nvidia GPU (#584) · 6ef3a011
  Ziyue Yang authored Dec 07, 2023
```
**Description**
Add MSCCL support for Nvidia GPU
```
  6ef3a011
22 Nov, 2023 1 commit

Dockerfile - Upgrade Docker image to CUDA 12.2 (#577) · 1ad1c21c

Yifan Xiong authored Nov 22, 2023

Upgrade Docker image to CUDA 12.2 for H100:
* upgrade base image to 23.10
* fix onnxruntime version in python3.10
* fix compilation errors

1ad1c21c

18 Aug, 2023 1 commit
- Benchmarks: micro benchmarks - add source code for DirectXRenderPerf (#549) · 6c0205ce
  Yuting Jiang authored Aug 18, 2023
```
**Description**
add source code for DirectXRenderPerf.

---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
```
  6c0205ce
27 Jul, 2023 1 commit

Release - SuperBench v0.9.0 (#558) · e1df877b

Yuting Jiang authored Jul 27, 2023

**Description**
Cherry-pick bug fixes from v0.9.0 to main.

**Major Revision**
- CI/CD: pipeline - free more disk space to fix the rocm image build
pipeline (#555)
- Benchmarks: bug fix - use absolute path for input file in
DirectXEncodingLatency (#554)
- CI/CD - add push win docker image on release branch in pipeline (#552)
- Docs - Upgrade version and release note (#557)

e1df877b

14 Apr, 2023 1 commit

Release - SuperBench v0.8.0 (#517) · 51761b3a

Yifan Xiong authored Apr 14, 2023



**Description**

Cherry-pick bug fixes from v0.8.0 to main.

**Major Revisions**

* Monitor - Fix the cgroup version checking logic (#502)
* Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
* Fix wrong torch usage in communication wrapper for Distributed
Inference Benchmark (#505)
* Analyzer: Fix bug in python3.8 due to pandas api change (#504)
* Bug - Fix bug to get metric from cmd when error happens (#506)
* Monitor - Collect realtime GPU power when benchmarking (#507)
* Add num_workers argument in model benchmark (#511)
* Remove unreachable condition when write host list (#512)
* Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
* Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
* Docs - Upgrade version and release note (#508)
Co-authored-by: guoshzhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

51761b3a

23 Feb, 2023 1 commit
- CI/CD - Free disk space in GitHub Action VHD (#481) · bbb86c4a
  Yifan Xiong authored Feb 23, 2023
```
Free more disk space in GitHub Action VHD.
```
  bbb86c4a
29 Dec, 2022 1 commit

Dockerfile - Add CUDA11.8 Docker image for Nvidia arch90 GPUs (#449) · a3c65b2a

Yifan Xiong authored Dec 29, 2022

Add Docker image for arch90 NVIDIA GPUs:

* add CUDA11.8 Dockerfile
* update archs in Makefile and benchmarks accordingly
* update image build pipeline

a3c65b2a

06 Jul, 2022 1 commit

Update dependencies and Dockerfile (#371) · 9f03d568

Yifan Xiong authored Jul 06, 2022

Update dependencies and Dockerfile:
* upgrade nccl-tests and rccl-tests to current latest version to match
  NCCL/RCCL versions
* unify image tag names on DockerHub
* remove verbose output in Dockerfile and make minor fixes to some flags

9f03d568

19 Jun, 2022 1 commit

Update ROCm Dockerfile (#361) · 483bf782

Yifan Xiong authored Jun 19, 2022

**Description**

Update ROCm Dockerfile.

**Major Revisions**
- Add dockerfile for ROCm 5.1.3
- Merge 5.1.x and 5.0.x dockerfile
- Remove legacy 4.2 and 4.0 dockerfiles
- Update build pipeline accordingly

483bf782

25 May, 2022 1 commit
- Dockerfile - Add dockerfile for rocm5.1.1 (#353) · 81a4146b
  user4543 authored May 25, 2022
```
**Description**
Add dockerfile for rocm5.1.1.
```
  81a4146b
28 Feb, 2022 1 commit
- Dockerfile - Add dockerfile for rocm5.0.1 (#319) · 425b9ff8
  user4543 authored Feb 28, 2022
```
**Description**
Add dockerfile for rocm5.0.1.
```
  425b9ff8
25 Feb, 2022 1 commit
- Dockerfile - Add rocm5.0 dockerfile (#307) · a4950a70
  user4543 authored Feb 26, 2022
```
**Description**
Add rocm5.0 dockerfile.
```
  a4950a70
08 Feb, 2022 1 commit

Benchmarks: Add Feature - Add GDR-only nccl-tests for Nvidia machines (#299) · 433785fd

Ziyue Yang authored Feb 08, 2022

This commit adds GDR-only nccl-tests for Nvidia machines. It also bumps NCCL to v2.10.3-1 to achieve peak performance in this test.

433785fd

26 Sep, 2021 1 commit

Release - SuperBench v0.3.0 (#212) · dfbd70b1

Yifan Xiong authored Sep 26, 2021



**Description**

Cherry-pick bug fixes from v0.3.0 to main.

**Major Revisions**
* Docs - Upgrade version and release note (#209)
* Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
* Benchmarks: Update - Update benchmarks in configuration file (#208)
* CI/CD - Update GitHub Action VM (#211)
* Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
* CI/CD - Fix bug in build image for push event (#205)
* Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
* Tool: Fix bug - Fix function naming issue in system info (#200)
* CI/CD - Push images in GitHub Action (#202)
* Bug - Fix torch.distributed command for single node (#201)
* CLI - Integrate system info for node (#199)
* Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
* CI/CD - Add ROCm image build in GitHub Actions (#194)
* Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
* Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
* Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
* Bug - Revise 'docker run' in sb deploy (#195)
* Bug - Fix Bug: fix wrong param name operations (should be operation) in rccl-bw of hpe config (#190)
Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com>
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>

dfbd70b1

09 Jul, 2021 1 commit

Bug bash - Merge fix from release/0.2 to main (#124) · 9c984c7e

guoshzhao authored Jul 09, 2021



* Bug Fix - Fix race condition issue for multi ranks (#117)

Fix race condition when multiple ranks rotate the same directory.

* Update pipeline for release branch (#122)

* Bug Fix - Fix bug when converting bool config to store_true argument. (#120)
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

9c984c7e
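
The commit above mentions converting a bool config value to a `store_true` argument, but does not show the bug or the fix. The following is a minimal sketch of the general technique only, using Python's standard `argparse`. The helper `config_to_args` and the parameter names `no_gpu`, `pin_memory`, and `batch_size` are hypothetical and not taken from the SuperBench code. The point is that a `store_true` flag takes no value: a true config entry becomes the bare flag, and a false entry is omitted entirely.

```python
import argparse


def config_to_args(params):
    """Convert a config mapping into a command-line argument list.

    Hypothetical helper: booleans map to bare store_true-style flags
    (present when True, omitted when False); all other values are
    passed as '--name value'.
    """
    args = []
    for name, value in params.items():
        flag = f"--{name}"
        if isinstance(value, bool):
            if value:
                args.append(flag)
        else:
            args.extend([flag, str(value)])
    return args


# Illustrative parser with two store_true flags and one valued option.
parser = argparse.ArgumentParser()
parser.add_argument("--no_gpu", action="store_true")
parser.add_argument("--pin_memory", action="store_true")
parser.add_argument("--batch_size", type=int, default=1)

argv = config_to_args({"no_gpu": True, "pin_memory": False, "batch_size": 32})
parsed = parser.parse_args(argv)
print(argv)
print(parsed)
```

Passing a boolean as `--pin_memory False` would make `argparse` reject the stray `False` token, which is why the sketch drops false flags instead of emitting a value for them.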

16 Jun, 2021 1 commit

Dockerfile - Update CUDA 11.1.1 Dockerfile (#96) · 25ec3a7c

Yifan Xiong authored Jun 16, 2021

Update packages and add build cache for CUDA 11.1.1 Dockerfile:

* Remove duplicate cmake and ompi, which are already in base image
* Add hpcx and sharp lib
* Add cache for gitmodules build
* Sort apt-get packages

25ec3a7c

01 Jun, 2021 1 commit
- Benchmarks: Build Pipeline - Add cutlass as a submodule and add build logic. (#85) · 40d7905e
  guoshzhao authored Jun 01, 2021
```
* add cutlass as submodule.
* add build script for cutlass.
* only support compute capability 7.0 (V100) and 8.0 (A100)
```
  40d7905e
17 May, 2021 1 commit
- CI/CD - Add GitHub Action to build and push image (#70) · af6eb004
  Yifan Xiong authored May 17, 2021
```
* add GitHub Action to build and push image
* update Dockerfile to copy from context
```
  af6eb004