Commits · e27491dff61164d2d9f6ce6c0532891c812827ea · tsoc / superbenchmark

06 Aug, 2025 1 commit

CI/CD - Merge cuda12.9 images. (#728) · e27491df

Hongtao Zhang authored Aug 05, 2025



**Description**
Merge ARM64 and AMD64 images into a single multi-architecture Docker
manifest under one artifact namespace.
Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
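
As a rough illustration of the merge step described above, the sketch below only assembles a `docker buildx imagetools create` command that combines two per-architecture tags under one tag. The `build_manifest_command` helper and all image names are hypothetical and are not taken from the pipeline itself.

```python
from typing import List


def build_manifest_command(target: str, sources: List[str]) -> List[str]:
    """Assemble a `docker buildx imagetools create` call (hypothetical helper).

    The command is only built, not executed, so this runs without Docker.
    """
    if len(sources) < 2:
        raise ValueError("a multi-arch manifest needs at least two source images")
    return ["docker", "buildx", "imagetools", "create", "--tag", target, *sources]


# Hypothetical image names, for illustration only.
cmd = build_manifest_command(
    "example/superbench:cuda12.9",
    ["example/superbench:cuda12.9-amd64", "example/superbench:cuda12.9-arm64"],
)
print(" ".join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would execute it on a machine with Docker Buildx installed and push access to the target repository.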

e27491df

25 Jun, 2025 1 commit

Dockerfile - Add cuda12.9 docker image (#716) · a56356d8

guoshzhao authored Jun 25, 2025



**Description**
Add cuda 12.9 dockerfile and build in pipeline.

---------
Co-authored-by: Guoshuai Zhao <microsoft@microsoft.com>
Co-authored-by: Hongtao Zhang <hongtaozhang@microsoft.com>
Co-authored-by: Hongtao Zhang <garyworkzht@gmail.com>

a56356d8

05 Jun, 2025 1 commit
- Update CODEOWNERS (#718) · 431bf19c
  Yifan Xiong authored Jun 05, 2025
```
Update CODEOWNERS.
```
  431bf19c
30 Apr, 2025 1 commit

CI/CD - Update OS of runner to the latest. (#702) · 330c68aa

Hongtao Zhang authored Apr 30, 2025



- Upgrade OS of github runner used by lint to the latest.
- Add symbolic link for clang-format to version 14.
- Update the importlib_metadata version, since the one inside
nvcr.io/nvidia/pytorch:20.12-py3 is too old and failed the 11.1 build.

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

330c68aa

09 Apr, 2025 1 commit
- CI/CD - Merge multi-arch image (#696) · b13ef28f
  Yifan Xiong authored Apr 08, 2025
```
Merge multi-arch image in build pipeline.
```
  b13ef28f
21 Mar, 2025 1 commit

Dockerfile - Support cuda12.8 for Blackwell arch (#682) · 294f1f20

pdr authored Mar 20, 2025



**Description**
Updated docker for CUDA 12.8
Use the latest cutlass release 3.8 with ARCH 100 (Blackwell) support
Add the latest nccl-test release with ARCH 100 (Blackwell) support
Updated msccl to support building for sm_100
No breaking changes; backward compatibility tested with CUDA 12.4

---------
Co-authored-by: Hongtao Zhang <garyworkzht@gmail.com>

294f1f20

12 Mar, 2025 1 commit

CI/CD - Update label in the ROCm image build (#693) · 48cd8a3c

Hongtao Zhang authored Mar 12, 2025



This change works around the matrix strategy’s default "fail-fast"
setting. In GitHub Actions, when running a job with a matrix, the
individual configurations run in parallel. By default, if one matrix job
(for example, the one labeled "rocm6_2_rocm6_2_x_superbe") fails, the
remaining parallel jobs are canceled automatically.

In our current image build pipeline, the arm64 build job is always
canceled because of the rocm build job. As a temporary solution, use a
non-existent label in the job config to prevent the rocm build job from
being scheduled.
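
The cancellation behavior described above can be pictured with a toy model. This is a minimal sketch, not GitHub Actions code: `run_matrix` and the job names are hypothetical, and it assumes the failure happens while every other matrix job is still running.

```python
from typing import Dict, List, Set


def run_matrix(jobs: List[str], failing: Set[str], fail_fast: bool = True) -> Dict[str, str]:
    """Toy model of a job matrix; returns the final status of each job.

    Assumption: a failing job fails while all other jobs are still in
    progress, so with fail_fast enabled every other job is cancelled.
    """
    any_failed = any(job in failing for job in jobs)
    status = {}
    for job in jobs:
        if job in failing:
            status[job] = "failed"
        elif any_failed and fail_fast:
            status[job] = "cancelled"
        else:
            status[job] = "success"
    return status


# Hypothetical job names, for illustration only.
jobs = ["cuda-amd64", "cuda-arm64", "rocm"]
with_fail_fast = run_matrix(jobs, failing={"rocm"})
without_fail_fast = run_matrix(jobs, failing={"rocm"}, fail_fast=False)
print(with_fail_fast)
print(without_fail_fast)
```

In this model, keeping the failing job out of the matrix (which is what the non-existent label achieves) leaves the remaining jobs free to finish.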

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>

48cd8a3c

07 Mar, 2025 1 commit
- CI/CD - Add image build on arm64 arch (#690) · 300df46b
  Yifan Xiong authored Mar 07, 2025
```
Add image build on arm64 arch.
```
  300df46b
21 Nov, 2024 1 commit
- Docs - Update CODEOWNERS (#670) · 54eeac25
  Yifan Xiong authored Nov 20, 2024
```
Update CODEOWNERS for docs.
```
  54eeac25
06 Nov, 2024 1 commit

Dockerfile - Add support for arm64 build (#660) · 47949127

pdr authored Nov 06, 2024

Add support for arm64 build:

- Updated dockerfile for arm64 build
- extend cpu stream compilation for neoverse 
- handle onnxruntime-gpu installation
- third party builds filtering based on arch
- disable cuda decode perf build for non x86

47949127

02 Nov, 2024 1 commit

CI/CD - Update Image Build Pipeline (#659) · 61770b89

Yifan Xiong authored Nov 01, 2024

**Description**

Update image build.

**Major Revision**

* Remove ROCm 6.0 image due to outdated packages
* Remove build tag for ROCm
* Preserve build cache for 30 days

61770b89

10 Oct, 2024 1 commit

Release - SuperBench v0.11.0 (#654) · 949f9cb4

Yuting Jiang authored Oct 10, 2024



**Description**
Cherry pick bug fixes from v0.11.0 to main

**Major Revision**
* #645 
* #648 
* #646 
* #647 
* #651 
* #652 
* #650

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

949f9cb4

28 Jul, 2024 1 commit
- CI/CD - Fix MSCCL build error in CUDA12.4 docker build pipeline (#633) · 2101e933
  Yuting Jiang authored Jul 29, 2024
```
**Description**
Fix MSCCL build error in CUDA12.4 docker build pipeline due to OOM
issue.
```
  2101e933
22 Apr, 2024 1 commit

Dockerfile - Add CUDA 12.4 dockerfile (#619) · 7435f10a

Yuting Jiang authored Apr 22, 2024

**Description**
Add CUDA 12.4 dockerfile.

**Major Revision**
- upgrade nvidia docker to 23.04


**Minor Revision**
- upgrade hpcx to 2.18

7435f10a

08 Jan, 2024 1 commit

Release - SuperBench v0.10.0 (#607) · 2c88db90

Yifan Xiong authored Jan 07, 2024



**Description**

Cherry-pick bug fixes from v0.10.0 to main.

**Major Revisions**

* Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
* Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
* Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
* Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
* Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
* CI/CD - Add ndv5 topo file #597
* Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
* Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
* Dockerfile - Bug fix for rocm docker build and deploy #598
* Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
* Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
* Monitor - Upgrade pyrsmi to amdsmi python library. #601
* Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
* Dockerfile - Add rocm6.0 dockerfile #602
* Bug Fix - Bug fix for latest megatron-lm benchmark #600
* Docs - Upgrade version and release note #606
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>

2c88db90

09 Dec, 2023 1 commit

Dockerfile - Upgrade to rocm5.7 dockerfile (#587) · 1f5031bd

Yuting Jiang authored Dec 10, 2023



**Description**
upgrade to rocm5.7 dockerfile.

---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>

1f5031bd

07 Dec, 2023 1 commit
- Benchmarks: Add MSCCL Support for Nvidia GPU (#584) · 6ef3a011
  Ziyue Yang authored Dec 07, 2023
```
**Description**
Add MSCCL support for Nvidia GPU
```
  6ef3a011
22 Nov, 2023 1 commit

Dockerfile - Upgrade Docker image to CUDA 12.2 (#577) · 1ad1c21c

Yifan Xiong authored Nov 22, 2023

Upgrade Docker image to CUDA 12.2 for H100:
* upgrade base image to 23.10
* fix onnxruntime version in python3.10
* fix compilation errors

1ad1c21c

22 Aug, 2023 1 commit
- Benchmarks: micro benchmark - source code for evaluating NVDEC decoding performance (#560) · 27a10811
  Yuting Jiang authored Aug 22, 2023
```
**Description**
source code for evaluating NVDEC decoding performance.

---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
```
  27a10811
18 Aug, 2023 1 commit
- Benchmarks: micro benchmarks - add source code for DirectXRenderPerf (#549) · 6c0205ce
  Yuting Jiang authored Aug 18, 2023
```
**Description**
add source code for DirectXRenderPerf.

---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
```
  6c0205ce
27 Jul, 2023 1 commit

Release - SuperBench v0.9.0 (#558) · e1df877b

Yuting Jiang authored Jul 27, 2023

**Description**
Cherry-pick bug fixes from v0.9.0 to main.

**Major Revision**
- CI/CD: pipeline - clean more disk space to fix rocm building image
pipeline (#555)
- Benchmarks: bug fix - use absolute path for input file in
DirectXEncodingLatency (#554)
- CI/CD - add push win docker image on release branch in pipeline (#552)
- Docs - Upgrade version and release note (#557)

e1df877b

05 Jul, 2023 3 commits
- Benchmarks: micro benchmarks - add python code for DirecXGPUMemBw (#547) · af4cfd5b
  Yuting Jiang authored Jul 05, 2023
```
**Description**
add python code for DirecXGPUMemBw.
```
  af4cfd5b
- Benchmarks: micro benchmarks - add python code for DirectXGPUCoreFlops (#542) · f1d608ae
  Yuting Jiang authored Jul 05, 2023
```
**Description**
add python code for DirectX core flops and init DirectX test pipeline.

**Major Revision**
- add python code for DirectX core flops 
- init DirectX test pipeline


**Minor Revision**
- add test for DirectX core flops
```
  f1d608ae
- CI/CD - Support DirectX test pipeline (#545) · 3704a432
  Yuting Jiang authored Jul 05, 2023
```
**Description**
Support DirectX test pipeline.
```
  3704a432
28 Jun, 2023 1 commit

Dockerfile - Add SuperBench Windows Dockerfile (#534) · 44ef5314

Yuting Jiang authored Jun 28, 2023



**Description**
Add dockerfile for win10 and building script for directx_benchmarks.

**Major Revision**
- Add docker file for win10 and required scripts to install the
dependency
- Add building script to build all directx vs benchmarks
- Add call of building script in Makefile

---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

44ef5314

14 Apr, 2023 1 commit

Release - SuperBench v0.8.0 (#517) · 51761b3a

Yifan Xiong authored Apr 14, 2023



**Description**

Cherry-pick bug fixes from v0.8.0 to main.

**Major Revisions**

* Monitor - Fix the cgroup version checking logic (#502)
* Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
* Fix wrong torch usage in communication wrapper for Distributed
Inference Benchmark (#505)
* Analyzer: Fix bug in python3.8 due to pandas api change (#504)
* Bug - Fix bug to get metric from cmd when error happens (#506)
* Monitor - Collect realtime GPU power when benchmarking (#507)
* Add num_workers argument in model benchmark (#511)
* Remove unreachable condition when write host list (#512)
* Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
* Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
* Docs - Upgrade version and release note (#508)
Co-authored-by: guoshzhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

51761b3a

23 Feb, 2023 1 commit
- CI/CD - Free disk space in GitHub Action VHD (#481) · bbb86c4a
  Yifan Xiong authored Feb 23, 2023
```
Free more disk space in GitHub Action VHD.
```
  bbb86c4a
29 Dec, 2022 1 commit

Dockerfile - Add CUDA11.8 Docker image for Nvidia arch90 GPUs (#449) · a3c65b2a

Yifan Xiong authored Dec 29, 2022

Add Docker image for arch90 NVIDIA GPUs:

* add CUDA11.8 Dockerfile
* update archs in Makefile and benchmarks accordingly
* update image build pipeline

a3c65b2a

18 Oct, 2022 1 commit

Benchmarks - Add support to allow list of custom config string in... · 3367c4f6

Yuting Jiang authored Oct 18, 2022

Benchmarks - Add support to allow list of custom config string in cudnn-functions and cublas-functions (#414)

**Description**
Add support to allow list of custom config string in cudnn-functions and cublas-functions.

3367c4f6

06 Jul, 2022 1 commit

Update dependencies and Dockerfile (#371) · 9f03d568

Yifan Xiong authored Jul 06, 2022

Update dependencies and Dockerfile:
* upgrade nccl-tests and rccl-tests to current latest version to match
  NCCL/RCCL versions
* unify image tag names on DockerHub
* remove verbose output in Dockerfile and minor fix some flags

9f03d568

19 Jun, 2022 1 commit

Update ROCm Dockerfile (#361) · 483bf782

Yifan Xiong authored Jun 19, 2022

**Description**

Update ROCm Dockerfile.

**Major Revisions**
- Add dockerfile for ROCm 5.1.3
- Merge 5.1.x and 5.0.x dockerfile
- Remove 4.2 and 4.0 legacy
- Update build pipeline accordingly

483bf782

25 May, 2022 1 commit
- Dockerfile - Add dockerfile for rocm5.1.1 (#353) · 81a4146b
  user4543 authored May 25, 2022
```
**Description**
Add dockerfile for rocm5.1.1.
```
  81a4146b
28 Feb, 2022 1 commit
- Dockerfile - Add dockerfile for rocm5.0.1 (#319) · 425b9ff8
  user4543 authored Feb 28, 2022
```
**Description**
Add dockerfile for rocm5.0.1.
```
  425b9ff8
25 Feb, 2022 1 commit
- Dockerfile - Add rocm5.0 dockerfile (#307) · a4950a70
  user4543 authored Feb 26, 2022
```
**Description**
Add rocm5.0 dockerfile.
```
  a4950a70
08 Feb, 2022 1 commit

Benchmarks: Add Feature - Add GDR-only nccl-tests for Nvidia machines (#299) · 433785fd

Ziyue Yang authored Feb 08, 2022

This commit adds GDR-only nccl-tests for Nvidia machines. Also bump NCCL to v2.10.3-1 to achieve peak performance in this test.

433785fd

12 Oct, 2021 1 commit

CI/CD - Disable version update, allow security update only (#224) · 5283bdeb

Yifan Xiong authored Oct 12, 2021

Disable dependabot version update, allow security update only.
Reference:
https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/configuration-options-for-dependency-updates#open-pull-requests-limit.

5283bdeb

11 Oct, 2021 1 commit

CI/CD - Add code security scanning (#206) · 849b6cac

Yifan Xiong authored Oct 11, 2021

Add code security scanning.

__Major Revisions__
* enable dependabot auto updates
* scan code with CodeQL

849b6cac

26 Sep, 2021 1 commit

Release - SuperBench v0.3.0 (#212) · dfbd70b1

Yifan Xiong authored Sep 26, 2021



**Description**

Cherry-pick bug fixes from v0.3.0 to main.

**Major Revisions**
* Docs - Upgrade version and release note (#209)
* Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
* Benchmarks: Update - Update benchmarks in configuration file (#208)
* CI/CD - Update GitHub Action VM (#211)
* Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
* CI/CD - Fix bug in build image for push event (#205)
* Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
* Tool: Fix bug - Fix function naming issue in system info  (#200)
* CI/CD - Push images in GitHub Action (#202)
* Bug - Fix torch.distributed command for single node (#201)
* CLI - Integrate system info for node (#199)
* Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
* CI/CD - Add ROCm image build in GitHub Actions (#194)
* Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
* Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
* Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
* Bug - Revise 'docker run' in sb deploy (#195)
* Bug - Fix Bug : fix bug of error param operations to operation in rccl-bw of hpe config (#190)
Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com>
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>

dfbd70b1

09 Jul, 2021 1 commit

Bug bash - Merge fix from release/0.2 to main (#124) · 9c984c7e

guoshzhao authored Jul 09, 2021



* Bug Fix - Fix race condition issue for multi ranks (#117)

Fix race condition issue when multiple ranks rotate the same directory.

* Update pipeline for release branch (#122)

* Bug Fix - Fix bug when converting bool config to store_true argument. (#120)
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>
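
To illustrate the kind of conversion the last fix refers to, here is a minimal sketch of turning a config mapping into command-line arguments when some options are `store_true` flags. The `config_to_args` helper and the parameter names are hypothetical and do not reproduce the actual fix.

```python
import argparse
from typing import Any, Dict, List


def config_to_args(params: Dict[str, Any]) -> List[str]:
    """Convert a config mapping into CLI arguments (hypothetical helper).

    A store_true flag takes no value: True emits the bare flag and
    False omits it, rather than emitting "--flag True" or "--flag False".
    """
    args: List[str] = []
    for key, value in params.items():
        flag = f"--{key}"
        if isinstance(value, bool):
            if value:
                args.append(flag)
        else:
            args.extend([flag, str(value)])
    return args


# Hypothetical parameters, for illustration only.
parser = argparse.ArgumentParser()
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--pin_memory", action="store_true")
parser.add_argument("--no_gpu", action="store_true")

cli_args = config_to_args({"batch_size": 32, "pin_memory": True, "no_gpu": False})
parsed = parser.parse_args(cli_args)
print(cli_args, parsed)
```

Checking `isinstance(value, bool)` before the general case matters here, because passing a literal `False` string to a `store_true` option would make argparse reject the command line.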

9c984c7e

25 Jun, 2021 1 commit
- Website - Initialize SuperBench website (#102) · e7b6af35
  Yifan Xiong authored Jun 25, 2021
```
* Initialize SuperBench website.
* Add GitHub Actions to automatically build and publish.
```
  e7b6af35