Commits · e304cf15728480ce9986e50fe30a7ca25ee40a3d · tsoc / superbenchmark

26 Jul, 2024 1 commit
- Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops (#634) · e304cf15
  Yuting Jiang authored Jul 26, 2024
```
**Description**
Add support for GPU arch 8.9 (NVIDIA L4/L40/L40s GPUs) in gemm-flops.
```
08 Jan, 2024 1 commit

Release - SuperBench v0.10.0 (#607) · 2c88db90

Yifan Xiong authored Jan 07, 2024



**Description**

Cherry-pick bug fixes from v0.10.0 to main.

**Major Revisions**

* Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
* Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
* Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
* Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
* Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
* CI/CD - Add ndv5 topo file #597
* Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
* Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
* Dockerfile - Bug fix for rocm docker build and deploy #598
* Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
* Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
* Monitor - Upgrade pyrsmi to amdsmi python library. #601
* Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
* Dockerfile - Add rocm6.0 dockerfile #602
* Bug Fix - Bug fix for latest megatron-lm benchmark #600
* Docs - Upgrade version and release note #606
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>

09 Dec, 2023 1 commit

Dockerfile - Upgrade to rocm5.7 dockerfile (#587) · 1f5031bd

Yuting Jiang authored Dec 10, 2023



**Description**
upgrade to rocm5.7 dockerfile.

---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>

07 Dec, 2023 2 commits
- Benchmarks: Add MSCCL Support for Nvidia GPU (#584) · 6ef3a011
  Ziyue Yang authored Dec 07, 2023
```
**Description**
Add MSCCL support for Nvidia GPU
```
- Benchmarks: Add benchmark: Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark (#582) · dd5a6329
  Yuting Jiang authored Dec 07, 2023
```
**Description**
Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark
```
22 Nov, 2023 1 commit
- Benchmarks: Micro benchmark - Add hipBLASLt function benchmark (#576) · 79089b65
  Yuting Jiang authored Nov 22, 2023
```
**Description**
Add hipBLASLt function benchmark and rebase cuBLASLt function benchmark.
```
27 Jul, 2023 1 commit

Release - SuperBench v0.9.0 (#558) · e1df877b

Yuting Jiang authored Jul 27, 2023

**Description**
Cherry-pick bug fixes from v0.9.0 to main.

**Major Revision**
- CI/CD: pipeline - clean more disk space to fix rocm building image pipeline (#555)
- Benchmarks: bug fix - use absolute path for input file in DirectXEncodingLatency (#554)
- CI/CD - add push win docker image on release branch in pipeline (#552)
- Docs - Upgrade version and release note (#557)

03 Jul, 2023 1 commit
- Benchmarks: Build Pipeline - add AMF in third party and build AMF encoding latency test (#543) · 86547217
  Yuting Jiang authored Jul 03, 2023
```
**Description**
add AMF in third party and build AMF encoding latency test.
```
21 Mar, 2023 1 commit

Adding HPL benchmark (#482) · 655bd0aa

rafsalas19 authored Mar 21, 2023



**Description**

- Adding HPL benchmark

---------
Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net>
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>

24 Feb, 2023 1 commit

Benchmarks: Build Pipeline - Add support for cpu-only perftest in makefile (#480) · 02923660

Yuting Jiang authored Feb 24, 2023



**Description**
Add support to install cpu-only perftest in makefile.
Co-authored-by: Yuting Jiang <yuting.jiang@microsoft.com>
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>

13 Feb, 2023 1 commit

Adding Stream Benchmark (#473) · 32896ca4

rafsalas19 authored Feb 13, 2023



**Description**

- Added stream benchmark
- Added stream unit test
- Added stream example
- Modified docker files to build stream

---------
Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net>
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
Co-authored-by: Yifan Xiong <xiongyf@yandex.com>

29 Dec, 2022 1 commit

Dockerfile - Add CUDA11.8 Docker image for Nvidia arch90 GPUs (#449) · a3c65b2a

Yifan Xiong authored Dec 29, 2022

Add Docker image for arch90 NVIDIA GPUs:

* add CUDA11.8 Dockerfile
* update archs in Makefile and benchmarks accordingly
* update image build pipeline

06 Jul, 2022 1 commit

Update dependencies and Dockerfile (#371) · 9f03d568

Yifan Xiong authored Jul 06, 2022

Update dependencies and Dockerfile:
* upgrade nccl-tests and rccl-tests to current latest version to match
  NCCL/RCCL versions
* unify image tag names on DockerHub
* remove verbose output in Dockerfile and fix some flags

19 Jun, 2022 1 commit

Update ROCm Dockerfile (#361) · 483bf782

Yifan Xiong authored Jun 19, 2022

**Description**

Update ROCm Dockerfile.

**Major Revisions**
- Add dockerfile for ROCm 5.1.3
- Merge 5.1.x and 5.0.x dockerfile
- Remove 4.2 and 4.0 legacy
- Update build pipeline accordingly

15 Jun, 2022 1 commit

Fix cmake and build issues (#360) · 60a3c743

Yifan Xiong authored Jun 15, 2022

**Description**

Fix cmake and build issues.

**Major Revision**

* Remove unnecessary boost build
* Remove user-agent for mlc
* Remove -j for third party to build each project in sequence
* Fix ansible collections installation path

16 Mar, 2022 1 commit

Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324) · ff51a3ce

rafsalas19 authored Mar 16, 2022

**Description**
Modifications adding GPU-Burn to SuperBench.
- added third party submodule
- modified Makefile to make gpu-burn binary
- added/modified microbenchmarks to add gpu-burn python scripts
- modified default and azure_ndv4 configs to add gpu-burn

24 Feb, 2022 1 commit
- Benchmarks: Build Pipeline - Make gpcnet only for cuda (#316) · 4f5027db
  user4543 authored Feb 24, 2022
```
**Description**
Make gpcnet only for cuda.
```
09 Feb, 2022 1 commit
- Benchmarks: Build Pipeline - Update rccl-tests submodule to fix divide by zero error (#306) · 4abda6f5
  user4543 authored Feb 09, 2022
```
**Description**
Update rccl-tests submodule to fix divide by zero error.
```
29 Jan, 2022 1 commit
- Benchmarks - Support T4 and A10 in GEMM benchmark (#294) · 3419447c
  Yifan Xiong authored Jan 29, 2022
```
Support T4 and A10 in GEMM benchmark.
```
30 Dec, 2021 1 commit

Release - SuperBench v0.4.0 (#278) · ff563b66

Yifan Xiong authored Dec 30, 2021



**Description**

Cherry-pick bug fixes from v0.4.0 to main.

**Major Revisions**

* Bug - Fix issues for Ansible and benchmarks (#267)
* Tests - Refine test cases for microbenchmark (#268)
* Bug - Build openmpi with ucx support in rocm dockerfiles (#269)
* Benchmarks: Fix Bug - Fix fio build issue (#272)
* Docs - Unify metric and add doc for cublas and cudnn functions (#271)
* Monitor: Revision - Add 'monitor/' prefix to monitor metrics in result summary (#274)
* Bug - Fix bug of detecting if gpu_index is none (#275)
* Bug - Fix bugs in data diagnosis (#273)
* Bug - Fix issue that the root mpi rank may not be the first in the hostfile (#270)
* Benchmarks: Configuration - Update inference and network benchmarks in configs (#276)
* Docs - Upgrade version and release note (#277)
Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>

01 Dec, 2021 1 commit
- Benchmarks: Build Pipeline - Upgrade FIO benchmark tool (#251) · b0e759f5
  Ziyue Yang authored Dec 01, 2021
```
**Description**
Upgrade FIO benchmark tool from 3.27 to 3.28.
```
21 Oct, 2021 1 commit

Benchmarks: Build Pipeline - Add gpcnet as git submodule and building logic (#228) · b592a7c7

Yuting Jiang authored Oct 21, 2021

**Description**
Add gpcnet as git submodule and building logic.

**Major Revision**
- add gpcnet as a submodule
- add build logic in third_party/Makefile

26 Sep, 2021 1 commit

Release - SuperBench v0.3.0 (#212) · dfbd70b1

Yifan Xiong authored Sep 26, 2021



**Description**

Cherry-pick bug fixes from v0.3.0 to main.

**Major Revisions**
* Docs - Upgrade version and release note (#209)
* Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
* Benchmarks: Update - Update benchmarks in configuration file (#208)
* CI/CD - Update GitHub Action VM (#211)
* Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
* CI/CD - Fix bug in build image for push event (#205)
* Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
* Tool: Fix bug - Fix function naming issue in system info (#200)
* CI/CD - Push images in GitHub Action (#202)
* Bug - Fix torch.distributed command for single node (#201)
* CLI - Integrate system info for node (#199)
* Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
* CI/CD - Add ROCm image build in GitHub Actions (#194)
* Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
* Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
* Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
* Bug - Revise 'docker run' in sb deploy (#195)
* Bug - Fix Bug: fix erroneous param 'operations' to 'operation' in rccl-bw of hpe config (#190)
Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com>
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>

31 Aug, 2021 1 commit

Benchmarks: Build Pipeline - Support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker (#172) · b90b47f3

Yuting Jiang authored Sep 01, 2021

**Description**
Revise rocblas building logic in third_party/makefile to support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker.

**Major Revision**
- add extra build logic, including an environment setting for the pthread limit and a restricted build command, to reduce resource usage

**Minor Revision**
- make rocm_version configurable

20 Aug, 2021 1 commit

Benchmarks: Build Pipeline - Add build logic of hipBusBandwidth in third_party (#151) · a1e5c90d

Yuting Jiang authored Aug 20, 2021

**Description**
Add build logic of hipBusBandwidth in third_party.

**Major Revision**
- Add build logic of hipBusBandwidth in third_party

02 Aug, 2021 1 commit

Benchmarks: Build Pipeline - Add rocBLAS building logic in third_party (#144) · 86c390a9

Yuting Jiang authored Aug 02, 2021

**Description**
Add rocBLAS building logic in third_party.

**Major Revision**
- Add rocm_rocblas target in third_party/Makefile.
- Add rocblas building logic

29 Jul, 2021 2 commits

Benchmarks: Build Pipeline - add rccl-tests as a submodule with building logic (#139) · a532eee4

Yuting Jiang authored Jul 30, 2021

**Description**
Support rocm in third_party/makefile and add rccl-tests as a submodule with building logic.

**Major Revision**
- Support rocm in third_party/makefile
- Add rccl-tests as a submodule 
- Add build logic in third_party/Makefile for rccl-tests

Benchmarks: Build Pipeline - Support rocm in third_party/makefile (#140) · c88ce056

Yuting Jiang authored Jul 29, 2021

**Description**
Support rocm in third_party/makefile.

**Major Revision**
- Split rocm and cuda target in makefile
- Add target in dockerfile

19 Jul, 2021 1 commit

Benchmarks: Build Pipeline - Add FIO benchmark tool (#127) · 4bbd7f51

Ziyue Yang authored Jul 19, 2021

**Description**
Add FIO benchmark tool into third-party dependency.

**Major Revision**
- Add FIO submodule into third-party directory and modify Makefile to enable it.

16 Jul, 2021 2 commits
- Benchmarks: Build Pipeline - Add perftest as a submodule and add build logic (#129) · 419dea26
  Yuting Jiang authored Jul 16, 2021
```
Add perftest as a submodule and add build logic
```
- Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic. (#128) · 8c8beb4b
  Yuting Jiang authored Jul 16, 2021
```
Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic.
```
15 Jul, 2021 1 commit

Benchmarks: Fix bug - fix bug of third_party/cuda-samples git checkout issue when building docker (#126) · 9547ccc1

Yuting Jiang authored Jul 15, 2021

* fix bug in docker build of third_party/cuda-samples

13 Jul, 2021 1 commit

Benchmarks: Add Benchmark - Add memory bandwidth benchmark for cuda. (#114) · f9550bd6

Yuting Jiang authored Jul 13, 2021

Add a microbenchmark, example, test, and config for CUDA memory performance; add cuda-samples (tagged with the CUDA version) as a git submodule; and update the related makefile.

01 Jun, 2021 1 commit
- Benchmarks: Build Pipeline - Add cutlass as a submodule and add build logic. (#85) · 40d7905e
  guoshzhao authored Jun 01, 2021
```
* add cutlass as submodule.
* add build script for cutlass.
* only support compute capability 7.0 (V100) and 8.0 (A100)
```
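As a rough illustration of what restricting a build to specific compute capabilities involves, the Python sketch below turns capability strings into nvcc `-gencode` flags. The `ARCHS` table and the `gencode_flags` helper are hypothetical names chosen for this example and are not taken from the SuperBench build scripts. The table lists only the capability-to-GPU pairs named in this log.

```python
# Hypothetical helper, not part of SuperBench: maps compute capabilities
# to nvcc -gencode flags of the form arch=compute_XX,code=sm_XX.

# Capability-to-GPU pairs mentioned in this commit log only.
ARCHS = {
    "7.0": ["V100"],
    "8.0": ["A100"],
    "8.9": ["L4", "L40", "L40s"],
}


def gencode_flags(capabilities):
    """Return one nvcc -gencode flag per capability string such as '8.0'."""
    flags = []
    for cap in capabilities:
        major, minor = cap.split(".")
        sm = f"{major}{minor}"  # '8.0' becomes '80'
        flags.append(f"-gencode arch=compute_{sm},code=sm_{sm}")
    return flags


if __name__ == "__main__":
    # Build the flag string for the two capabilities the cutlass commit names.
    print(" ".join(gencode_flags(["7.0", "8.0"])))
```

Passing `sorted(ARCHS)` instead would cover every capability named in the log, including the 8.9 support added later for gemm-flops.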