Commits · 51761b3af172b4fc54ce0a3abc302e203d2bf44a · tsoc / superbenchmark

14 Apr, 2023 1 commit

Release - SuperBench v0.8.0 (#517) · 51761b3a

Yifan Xiong authored Apr 14, 2023



**Description**

Cherry-pick bug fixes from v0.8.0 to main.

**Major Revisions**

* Monitor - Fix the cgroup version checking logic (#502)
* Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
* Fix wrong torch usage in communication wrapper for Distributed
Inference Benchmark (#505)
* Analyzer: Fix bug in python3.8 due to pandas api change (#504)
* Bug - Fix bug to get metric from cmd when error happens (#506)
* Monitor - Collect realtime GPU power when benchmarking (#507)
* Add num_workers argument in model benchmark (#511)
* Remove unreachable condition when write host list (#512)
* Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
* Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
* Docs - Upgrade version and release note (#508)
Co-authored-by: guoshzhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

51761b3a

28 Jan, 2023 1 commit

Release - SuperBench v0.7.0 (#468) · b07fda15

Yifan Xiong authored Jan 28, 2023



**Description**

Cherry-pick bug fixes from v0.7.0 to main.

**Major Revisions**

* Benchmarks - Fix missing include in FP8 benchmark (#460)
* Fix bug in TE BERT model (#461)
* Doc - Update benchmark doc (#465)
* Bug: Fix bug for incorrect datatype judgement in cublas-function
source code (#464)
* Support `sb deploy` without pulling image (#466)
* Docs - Upgrade version and release note (#467)
Co-authored-by: Russell J. Hewett <russell.j.hewett@gmail.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

b07fda15

06 Sep, 2022 1 commit

Release - SuperBench v0.6.0 (#409) · 63e9b2d1

Yifan Xiong authored Sep 06, 2022



**Description**

Cherry-pick bug fixes from v0.6.0 to main.

**Major Revisions**

* Enable latency test in ib traffic validation distributed benchmark (#396)
* Enhance parameter parsing to allow spaces in value (#397)
* Update apt packages in dockerfile (#398)
* Upgrade colorlog for NO_COLOR support (#404)
* Analyzer - Update error handling to support exit code of sb result diagnosis (#403)
* Analyzer - Make baseline file optional in data diagnosis and fix bugs (#399)
* Enhance timeout cleanup to avoid possible hanging (#405)
* Auto generate ibstat file by pssh (#402)
* Analyzer - Format int type and unify empty value to N/A in diagnosis output file (#406)
* Docs - Upgrade version and release note (#407)
* Docs - Fix issues in document (#408)
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>

63e9b2d1

24 Jun, 2022 1 commit

Support multiple IB/GPU in ib validation (#363) · bfaa1c83

Yifan Xiong authored Jun 24, 2022

**Description**

Support multiple IB/GPU devices run simultaneously in ib validation benchmark.

**Major Revisions**
- Revise ib_validation_performance.cc so that multiple processes per node could be used to launch multiple perftest commands simultaneously. For each node pair in the config, number of processes per node will run in parallel.
- Revise ib_validation_performance.py to correct file paths and adjust parameters to specify different NICs/GPUs/NUMA nodes.
- Fix env issues in Dockerfile for end-to-end test.
- Update ib-traffic configuration examples in config files.
- Update unit tests and docs accordingly.

Closes #326.

bfaa1c83

29 Apr, 2022 1 commit

Release - SuperBench v0.5.0 (#350) · 6681c720

Yifan Xiong authored Apr 29, 2022



**Description**

Cherry-pick  bug fixes from v0.5.0 to main.

**Major Revisions**

* Bug - Force to fix ort version as '1.10.0' (#343)
* Bug - Support no matching rules and unify the output name in result_summary (#345)
* Analyzer - Support regex in annotations of benchmark naming for metrics in rules (#344)
* Bug - Fix bugs in sync results on root rank for e2e model benchmarks (#342)
* Bug - Fix bug of duration feature for model benchmarks in distributed mode (#347)
* Docs - Upgrade version and release note (#348)
Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>

6681c720

16 Mar, 2022 1 commit

Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324) · ff51a3ce

rafsalas19 authored Mar 16, 2022

**Description**
Modifications adding GPU-Burn to SuperBench.
- added third party submodule
- modified Makefile to make gpu-burn binary
- added/modified microbenchmarks to add gpu-burn python scripts
- modified default and azure_ndv4 configs to add gpu-burn

ff51a3ce

08 Feb, 2022 2 commits
- Benchmarks: Add Feature - Add GDR-only nccl-tests for Nvidia machines (#299) · 433785fd
  Ziyue Yang authored Feb 08, 2022
```
This commit adds GDR-only nccl-tests for Nvidia machines. Also bump NCCL to v2.10.3-1 to achieve peak performance in this test.
```
  433785fd
- Benchmarks: Revise Code - Make data checking in gpu_copy optional (#301) · 682b2c12
  Ziyue Yang authored Feb 08, 2022
```
This commit makes data checking in gpu_copy optional, because it will take too long time if message size is large.
```
  682b2c12
27 Jan, 2022 1 commit
- Config - Disable disk-benchmark in ndmv4.yaml and change batch size to 1 in default.yaml (#292) · f283b536
  Yuting Jiang authored Jan 28, 2022
```
**Description**
Disable disk-benchmark in ndmv4.yaml and change batch size to 1 in default.yaml
```
  f283b536
30 Dec, 2021 1 commit

Release - SuperBench v0.4.0 (#278) · ff563b66

Yifan Xiong authored Dec 30, 2021



__Description__

Cherry-pick  bug fixes from v0.4.0 to main.

__Major Revisions__

* Bug - Fix issues for Ansible and benchmarks (#267)
* Tests - Refine test cases for microbenchmark (#268)
* Bug - Build openmpi with ucx support in rocm dockerfiles (#269)
* Benchmarks: Fix Bug - Fix fio build issue (#272)
* Docs - Unify metric and add doc for cublas and cudnn functions (#271)
* Monitor: Revision - Add 'monitor/' prefix to monitor metrics in result summary (#274)
* Bug - Fix bug of detecting if gpu_index is none (#275)
* Bug - Fix bugs in data diagnosis (#273)
* Bug - Fix issue that the root mpi rank may not be the first in the hostfile (#270)
* Benchmarks: Configuration - Update inference and network benchmarks in configs (#276)
* Docs - Upgrade version and release note (#277)
Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>

ff563b66

13 Dec, 2021 1 commit

Benchmarks: Add Benchmark - Add mlc benchmark to superbench (#216) · b590409e

Hossein Pourreza authored Dec 12, 2021

**Description**
Add mlc memory bandwidth and latency micro benchmark to Superbench.

**Major Revision**
- Add mlc benchmark with test and example files

b590409e

10 Dec, 2021 2 commits

Benchmarks: Add Benchmark - Add ONNXRuntime inference benchmark based on ORT python API (#245) · 4d85630a

guoshzhao authored Dec 10, 2021

**Description**
Add ONNXRuntime inference benchmark based on ORT python API.

**Major Revision**
- Add `ORTInferenceBenchmark` class to export pytorch model to onnx model and do inference
- Add tests and example for `ort-inference` benchmark
- Update the introduction docs.

4d85630a

Monitor: Integration - Integrate monitor into Superbench (#259) · 6e357fb9

guoshzhao authored Dec 10, 2021

**Description**
Integrate monitor into Superbench.

**Major Revision**
- Initialize, start and stop monitor in SB executor.
- Parse the monitor data in SB runner and merge into benchmark results.
- Specify ReduceType for monitor metrics, such as MAX, MIN and LAST.
- Add monitor configs into config file.

6e357fb9

08 Dec, 2021 1 commit

Bug - Fix issues for distributed runs (#258) · 213ab14b

Yifan Xiong authored Dec 08, 2021

Fix issues for distributed runs:
* fix config for memory bandwidth benchmarks
* add throttling for high concurrency docker pull
* update rsync path and exclude directories
* handle exceptions when creating summary
* tune for logging

213ab14b

02 Dec, 2021 1 commit
- Benchmark: Replace `-c` argument with `-N` for `numactl` in Configuration (#250) · b4ea97bf
  Yifan Xiong authored Dec 02, 2021
```
**Description**
Replace `-c` argument with `-N` for `numactl` since the old `-c`/`--cpubind` argument is deprecated.
```
  b4ea97bf
30 Oct, 2021 1 commit

Benchmarks: Add Feature - Add CPU-initiated copy and dtod support to gpu-sm-copy benchmark (#230) · 008e0fe1

Ziyue Yang authored Oct 30, 2021

**Description**
This commit does the following:
1) Adds CPU-initiated copy benchmark;
2) Adds dtod benchmark;
3) Support scanning NUMA nodes and GPUs inside the benchmark program;
4) Change the name of gpu-sm-copy to gpu-copy.

008e0fe1

26 Sep, 2021 1 commit

Release - SuperBench v0.3.0 (#212) · dfbd70b1

Yifan Xiong authored Sep 26, 2021



**Description**

Cherry-pick  bug fixes from v0.3.0 to main.

**Major Revisions**
* Docs - Upgrade version and release note (#209)
* Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
* Benchmarks: Update - Update benchmarks in configuration file (#208)
* CI/CD - Update GitHub Action VM (#211)
* Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
* CI/CD - Fix bug in build image for push event (#205)
* Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
* Tool: Fix bug - Fix function naming issue in system info  (#200)
* CI/CD - Push images in GitHub Action (#202)
* Bug - Fix torch.distributed command for single node (#201)
* CLI - Integrate system info for node (#199)
* Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
* CI/CD - Add ROCm image build in GitHub Actions (#194)
* Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
* Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
* Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
* Bug - Revise 'docker run' in sb deploy (#195)
* Bug - Fix Bug : fix bug of error param operations to operation in rccl-bw of hpe config (#190)
Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com>
Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>

dfbd70b1

31 Aug, 2021 1 commit

Benchmarks: Code Revision - Revise metric name generation and default config... · 024a870b

Ziyue Yang authored Aug 31, 2021

Benchmarks: Code Revision - Revise metric name generation and default config for disk performance benchmark (#175)

**Description**
This commit revises disk performance benchmark, including:
1) Add missing benchmark name in default config;
2) Avoid using reserved character ':' in metric name.

024a870b

30 Aug, 2021 1 commit
- Benchmarks: Add Benchmark - Add GPU SM copy benchmark (#169) · b97197f0
  Ziyue Yang authored Aug 30, 2021
```
**Description**
This commit adds gpu_sm_copy benchmark and related tests.
```
  b97197f0
26 Jul, 2021 1 commit

Benchmarks: Add Benchmark - Add NCCL performance benchmark (#113) · e083a598

Yuting Jiang authored Jul 26, 2021

**Description**
Add NCCL performance microbenchmark.

**Major Revision**
- Add microbenchmark, example, test, config for NCCL

e083a598

23 Jul, 2021 2 commits

Benchmarks: Add Benchmark - Add IB Loopback performance benchmark. (#112) · b0c5addc

Yuting Jiang authored Jul 24, 2021

**Description**
Add RDMA Loopback performance microbenchmark.

**Major Revision**
- Add microbenchmark, example, test, config for RDMA Loopback

b0c5addc

Benchmarks: Add Benchmark - Add disk performance benchmark (#132) · db297fb4

Ziyue Yang authored Jul 23, 2021

**Description**
Add disk performance microbenchmark.

**Major Revision**
- Add microbenchmark, example, test, config for disk performance.

**Minor Revision**
- Fix bugs in executor unit test related to default enabled tests.

db297fb4

13 Jul, 2021 1 commit

Benchmarks: Add Benchmark - Add memory bandwidth benchmark for cuda. (#114) · f9550bd6

Yuting Jiang authored Jul 13, 2021

Add microbenchmark, example, test, config for cuda memory performance and Add cuda-samples(tag with cuda version) as git submodule and update related makefile

f9550bd6

02 Jul, 2021 1 commit
- Docs - Update README and version for v0.2.0 release (#111) · 43620c3f
  Yifan Xiong authored Jul 03, 2021
```
Update README and version for v0.2 release.
```
  43620c3f
23 Jun, 2021 1 commit

Bug bash - Fix bugs in multi GPU benchmarks (#98) · c0c43b8f

Yifan Xiong authored Jun 23, 2021

* Add `sb deploy` command content.
* Fix inline if-expression syntax in playbook.
* Fix quote escape issue in bash command.
* Add custom env in config.
* Update default config for multi GPU benchmarks.
* Update MANIFEST.in to include jinja2 template.
* Require jinja2 minimum version.
* Fix occasional duplicate output in Ansible runner.
* Fix mixed color from Ansible and Python colorlog.
* Update according to comments.
* Change superbench.env from list to dict in config file.

c0c43b8f

16 Jun, 2021 1 commit

Bug bash - Fix bugs and refine log in single GPU benchmarks (#97) · ddbc51a1

Yifan Xiong authored Jun 16, 2021

Fix bugs and refine log in single GPU benchmarks:

* Fix none framework issue
* Fix empty parameter bug
* Remove missed mobilenet_v3 models
* Change benchmark registration log to debug level
* Add pid in logging
* Add missing benchmarks in default config
* Fix deprecated logging warn

ddbc51a1

02 Jun, 2021 1 commit
- Runner - Support local mode in runner (#88) · 6b0ca1cb
  Yifan Xiong authored Jun 02, 2021
```
* Support local mode in runner.
```
  6b0ca1cb
28 May, 2021 1 commit

Runner - Support torch.distributed mode in runner (#81) · 8b4f613a

Yifan Xiong authored May 28, 2021

* Support `torch.distributed` mode in runner.
* Support given `proc_num` and `node_num` in `torch.distributed` mode.

8b4f613a

13 Apr, 2021 1 commit

Executor - Fix issues when executing benchmarks (#51) · 8c527308

Yifan Xiong authored Apr 13, 2021

* fix missing package in dockerfile
* update benchmark list and parameters
* catch runtime errors
* refine logging info

8c527308

09 Apr, 2021 1 commit

Executor: Init - Add superbench executor class (#34) · c74f4879

Yifan Xiong authored Apr 09, 2021

Add superbench executor class

* add executor class
* update default config to exec benchmarks
* add micro benchmarks and model benchmarks

c74f4879

12 Mar, 2021 1 commit

CLI - Add command sb [version,deploy,exec,run] (#10) · 5d11579a

Yifan Xiong authored Mar 12, 2021

- Add CLI commands
  * sb version
  * sb deploy
  * sb exec
  * sb run
- Add interface with executor and runner
- Add cli test cases

5d11579a