Commits · 4fa10f4d00632bb68657160d38bebafe3946aa46 · tsoc / superbenchmark

23 Apr, 2026 1 commit

Benchmarks: Add gpu-hpl and gpu-hpl-mxp micro benchmarks (#15) · 4fa10f4d

one authored Apr 23, 2026

Add gpu-hpl and gpu-hpl-mxp micro benchmarks backed by rocHPL and rocHPL-MxP.

Implement a shared GPU HPL base that:
- Generates per-workload HPL dat files and parses the corresponding output files.
- Supports common HPL inputs such as process grid, matrix size, block size, broadcast topology, warmup, iterations, and reduce operator.
- Adds rocHPL-specific tuning parameters for gpu-hpl.
- Formats metric keys from input-derived workload attributes.
- Reports `flops`, `time`, and `tests_pass` metrics with warmup-aware aggregation.
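
The commit message does not show how the warmup-aware aggregation works. The sketch below is one plausible reading, not the code from this commit: discard the first `warmup` samples, then apply the reduce operator to the rest. The function name `aggregate_runs`, its parameters, and the set of supported operators are assumptions made for illustration.

```python
from statistics import mean


def aggregate_runs(samples, warmup=1, reduce_op="mean"):
    """Drop the first `warmup` samples, then reduce the remaining ones.

    Hypothetical helper: the name, signature, and supported operators are
    illustrative assumptions, not the SuperBench implementation.
    """
    reducers = {"min": min, "max": max, "mean": mean}
    if reduce_op not in reducers:
        raise ValueError(f"unsupported reduce operator: {reduce_op}")
    measured = samples[warmup:]  # warmup runs do not count toward the metric
    if not measured:
        raise ValueError("no samples left after discarding warmup runs")
    return reducers[reduce_op](measured)


# Three made-up per-iteration results; the first one is treated as warmup.
flops_samples = [1.0e12, 2.0e12, 4.0e12]
print(aggregate_runs(flops_samples, warmup=1, reduce_op="max"))
```

With `warmup=1` the first sample is ignored, so a slow first iteration does not skew the reported `flops` or `time`.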

Add benchmark registrations, parser tests, sample output fixtures, documentation, and recommended configurations for gpu-hpl and gpu-hpl-mxp.

Update rocHPL and rocHPL-MxP third-party integration with build patches, install targets, and SuperBench run helper scripts.

Also update gpu-hpcg metric naming to use flops instead of gflops, remove standalone domain/verification-style metrics from the documented metric surface, and refresh Hygon HPCG documentation/config references accordingly.


27 Mar, 2026 1 commit

MicroBenchmark: rocHPCG · e4c2bd4c

one authored Mar 27, 2026

19 Mar, 2026 1 commit

Update DTK dockerfile and microbenchmarks · c4f39919

one authored Mar 19, 2026

- Update rocm_common.cmake for CMake >= 3.24
- Prevent isolation build
- Add BabelStream as a submodule
- Update dockerignore


01 Oct, 2025 1 commit

Dockerfile - add cuda13.0.dockerfile (#739) · 60189dd6

WenqingLan1 authored Oct 01, 2025



Add support for cuda13.0.

- Add cuda13.0.dockerfile.
- Add cuda13.0 image building task to the GitHub pipeline.
- Update GPU STREAM to work with cuda13.0.
- Fix data type conversion perf bug in GPU STREAM.
- Update nvbandwidth submodule to v0.8.
- Update perftest submodule to 4bee61f80d9e268fc97eaf40be00409e91d3a19e (recent master).

---------
Co-authored-by: Ubuntu <dilipreddi@gmail.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>


21 Nov, 2024 1 commit

Benchmarks: micro benchmarks - add nvbandwidth build (#665) · c8c52eb2

Hongtao Zhang authored Nov 21, 2024



**Description**
Add nvbandwidth build to repo

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>


08 Jan, 2024 1 commit

Release - SuperBench v0.10.0 (#607) · 2c88db90

Yifan Xiong authored Jan 07, 2024



**Description**

Cherry-pick bug fixes from v0.10.0 to main.

**Major Revisions**

* Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
* Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
* Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
* Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
* Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
* CI/CD - Add ndv5 topo file #597
* Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
* Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
* Dockerfile - Bug fix for rocm docker build and deploy #598
* Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
* Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
* Monitor - Upgrade pyrsmi to amdsmi python library. #601
* Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
* Dockerfile - Add rocm6.0 dockerfile #602
* Bug Fix - Bug fix for latest megatron-lm benchmark #600
* Docs - Upgrade version and release note #606

Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>


07 Dec, 2023 1 commit

Benchmarks: Add MSCCL Support for Nvidia GPU (#584) · 6ef3a011

Ziyue Yang authored Dec 07, 2023

**Description**
Add MSCCL support for Nvidia GPU

16 Mar, 2022 1 commit

Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324) · ff51a3ce

rafsalas19 authored Mar 16, 2022

**Description**
Modifications adding GPU-Burn to SuperBench.
- added third party submodule
- modified Makefile to make gpu-burn binary
- added/modified microbenchmarks to add gpu-burn python scripts
- modified default and azure_ndv4 configs to add gpu-burn


21 Oct, 2021 1 commit

Benchmarks: Build Pipeline - Add gpcnet as git submodule and building logic (#228) · b592a7c7

Yuting Jiang authored Oct 21, 2021

**Description**
Add gpcnet as git submodule and building logic.

**Major Revision**
- add gpcnet as a submodule
- add build logic in third_party/Makefile


29 Jul, 2021 1 commit

Benchmarks: Build Pipeline - add rccl-tests as a submodule with building logic (#139) · a532eee4

Yuting Jiang authored Jul 30, 2021

**Description**
Support ROCm in third_party/Makefile and add rccl-tests as a submodule with building logic.

**Major Revision**
- Support ROCm in third_party/Makefile
- Add rccl-tests as a submodule
- Add build logic in third_party/Makefile for rccl-tests


19 Jul, 2021 1 commit

Benchmarks: Build Pipeline - Add FIO benchmark tool (#127) · 4bbd7f51

Ziyue Yang authored Jul 19, 2021

**Description**
Add FIO benchmark tool into third-party dependency.

**Major Revision**
- Add FIO submodule into third-party directory and modify Makefile to enable it.


16 Jul, 2021 2 commits

Benchmarks: Build Pipeline - Add perftest as a submodule and add build logic (#129) · 419dea26

Yuting Jiang authored Jul 16, 2021

Add perftest as a submodule and add build logic

Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic. (#128) · 8c8beb4b

Yuting Jiang authored Jul 16, 2021

Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic.

15 Jul, 2021 1 commit

Benchmarks: Fix bug - fix bug of third_party/cuda-samples git checkout issue when building docker (#126) · 9547ccc1

Yuting Jiang authored Jul 15, 2021

* fix bug in docker build of third_party/cuda-samples


13 Jul, 2021 1 commit

Benchmarks: Add Benchmark - Add memory bandwidth benchmark for cuda. (#114) · f9550bd6

Yuting Jiang authored Jul 13, 2021

Add a microbenchmark, example, test, and config for CUDA memory performance. Add cuda-samples (tagged with the CUDA version) as a git submodule and update the related Makefile.


01 Jun, 2021 1 commit

Benchmarks: Build Pipeline - Add cutlass as a submodule and add build logic. (#85) · 40d7905e

guoshzhao authored Jun 01, 2021

* Add cutlass as a submodule.
* Add a build script for cutlass.
* Only support compute capability 7.0 (V100) and 8.0 (A100).