Commits · 60a3c74306b2a6359f17cd80562304e542bc720a · tsoc / superbenchmark

15 Jun, 2022 1 commit

Fix cmake and build issues (#360) · 60a3c743

Yifan Xiong authored Jun 15, 2022

**Description**

Fix cmake and build issues.

**Major Revision**

* Remove unnecessary boost build
* Remove user-agent for mlc
* Remove -j for third party to build each project in sequence
* Fix ansible collections installation path

60a3c743

14 Jun, 2022 1 commit

Support `sb run` on host directly without Docker (#358) · a4937e95

Yifan Xiong authored Jun 14, 2022

**Description**

Support `sb run` on host directly without Docker

**Major Revisions**
- Add `--no-docker` argument for `sb run`.
- Run on host directly if `--no-docker` is specified.
- Update docs and tests correspondingly.

a4937e95

06 Jun, 2022 1 commit

Bump eventsource from 1.1.0 to 1.1.1 in /website (#357) · 528d69bd

dependabot[bot] authored Jun 06, 2022

Bumps [eventsource](https://github.com/EventSource/eventsource) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/EventSource/eventsource/releases)
- [Changelog](https://github.com/EventSource/eventsource/blob/master/HISTORY.md)
- [Commits](https://github.com/EventSource/eventsource/compare/v1.1.0...v1.1.1)

---
updated-dependencies:
- dependency-name: eventsource
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

528d69bd

02 Jun, 2022 2 commits

Bump cross-fetch from 3.1.4 to 3.1.5 in /website (#349) · 77f8048a

dependabot[bot] authored Jun 02, 2022

Bumps [cross-fetch](https://github.com/lquixada/cross-fetch) from 3.1.4 to 3.1.5.
- [Release notes](https://github.com/lquixada/cross-fetch/releases)
- [Commits](https://github.com/lquixada/cross-fetch/compare/v3.1.4...v3.1.5)

---
updated-dependencies:
- dependency-name: cross-fetch
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

77f8048a

Bump async from 2.6.3 to 2.6.4 in /website (#351) · cdd19e6f

dependabot[bot] authored Jun 02, 2022

Bumps [async](https://github.com/caolan/async) from 2.6.3 to 2.6.4.
- [Release notes](https://github.com/caolan/async/releases)
- [Changelog](https://github.com/caolan/async/blob/v2.6.4/CHANGELOG.md)
- [Commits](https://github.com/caolan/async/compare/v2.6.3...v2.6.4)

---
updated-dependencies:
- dependency-name: async
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

cdd19e6f

01 Jun, 2022 1 commit

Analyzer - Fix bugs in data diagnosis (#355) · 54da021b

user4543 authored Jun 01, 2022

**Description**
Fix bugs in data diagnosis.

**Major Revision**
- Add support for getting the baseline of a metric whose benchmark name uses a custom annotation with ':', such as 'nccl-bw:default/allreduce_8_bw:0'.
- Save raw data of all metrics, rather than only the metrics defined in diagnosis_rules.yaml, when output_all is True.
- Fix a bug that used the wrong column index when applying formatting (red color and percentile) in the Excel output.
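
To make the naming issue concrete, here is a minimal sketch of splitting a metric name such as 'nccl-bw:default/allreduce_8_bw:0' so that the ':' inside the benchmark name is not confused with the trailing rank. The helper names, and the assumption that the baseline is keyed by the name without the rank suffix, are hypothetical and only for illustration; they are not taken from the SuperBench code.

```python
import re

# Assumption: an optional trailing ":<digits>" is the rank suffix.
_RANK_SUFFIX = re.compile(r':(\d+)$')


def split_metric_name(name):
    """Split a metric name into (benchmark, metric, rank).

    The benchmark part may itself contain ':' (for example 'nccl-bw:default'),
    so only a trailing numeric suffix is treated as the rank.
    """
    rank = None
    match = _RANK_SUFFIX.search(name)
    if match:
        rank = int(match.group(1))
        name = name[:match.start()]
    # Everything before the first '/' is the benchmark name.
    benchmark, _, metric = name.partition('/')
    return benchmark, metric, rank


def baseline_key(name):
    """Hypothetical baseline lookup key: the metric name without its rank."""
    benchmark, metric, _ = split_metric_name(name)
    return f'{benchmark}/{metric}'


print(split_metric_name('nccl-bw:default/allreduce_8_bw:0'))
print(baseline_key('nccl-bw:default/allreduce_8_bw:0'))
```

Anchoring the rank pattern to the end of the string is what keeps the annotation ':default' attached to the benchmark name.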

54da021b

31 May, 2022 1 commit

Dockerfile - Add support to run sb command inside docker image (#356) · 3f135e46

user4543 authored Jun 01, 2022

**Description**
Add support to run sb command inside docker image - install missing dependency.

3f135e46

27 May, 2022 1 commit

Dockerfile: Update rccl version and fix issue in rocm5.1.1 dockerfile (#354) · e08b6d3a

user4543 authored May 27, 2022

**Description**
Update rccl version and fix issue in rocm5.1.1 dockerfile.

e08b6d3a

25 May, 2022 1 commit

Dockerfile - Add dockerfile for rocm5.1.1 (#353) · 81a4146b

user4543 authored May 25, 2022

**Description**
Add dockerfile for rocm5.1.1.

81a4146b

29 Apr, 2022 1 commit

Release - SuperBench v0.5.0 (#350) · 6681c720

Yifan Xiong authored Apr 29, 2022



**Description**

Cherry-pick bug fixes from v0.5.0 to main.

**Major Revisions**

* Bug - Force to fix ort version as '1.10.0' (#343)
* Bug - Support no matching rules and unify the output name in result_summary (#345)
* Analyzer - Support regex in annotations of benchmark naming for metrics in rules (#344)
* Bug - Fix bugs in sync results on root rank for e2e model benchmarks (#342)
* Bug - Fix bug of duration feature for model benchmarks in distributed mode (#347)
* Docs - Upgrade version and release note (#348)
Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>

6681c720

20 Apr, 2022 1 commit

Docs - Update links using relative file paths with extensions (#346) · 712eafc3

user4543 authored Apr 21, 2022

**Description**
Update links of referencing other docs using relative file paths with extensions.

712eafc3

15 Apr, 2022 1 commit

Docs - Update link to cli.md (#341) · cb266911

Jared Bowden authored Apr 15, 2022

**Description**
Fixes relative link in documentation: point to `../cli.md`.

cb266911

11 Apr, 2022 2 commits

Benchmarks: Add Benchmark - Add FAMBench based on docker benchmark (#338) · 80dcc8aa

guoshzhao authored Apr 11, 2022

**Description**
Integrate FAMBench into superbench based on docker implementation:
https://github.com/facebookresearch/FAMBench

The script to run all benchmarks is:
https://github.com/facebookresearch/FAMBench/blob/main/benchmarks/run_all.sh

80dcc8aa

CLI - Integrate output all nodes diagnosis results (#339) · 8dc19ca4

user4543 authored Apr 11, 2022

**Description**
Integrate output all nodes diagnosis results.

8dc19ca4

10 Apr, 2022 1 commit

Analyzer: Add Feature - Output results of all nodes in data diagnosis (#336) · 55b0f9d2

user4543 authored Apr 10, 2022

**Description**
Output results of all nodes in data diagnosis.

55b0f9d2

08 Apr, 2022 2 commits

Docs - Add usage for result summary (#337) · 56c9a711

user4543 authored Apr 09, 2022

**Description**
Add usage for result summary.

56c9a711

CLI - Integrate result summary and update output format of data diagnosis (#335) · f15da60b

user4543 authored Apr 08, 2022

**Description**
Integrate result summary and update output format of data diagnosis.

**Major Revision**
- integrate result summary
- add md and html format for data diagnosis

f15da60b

01 Apr, 2022 1 commit

Benchmarks: Add Feature - Provide option to save raw data into file. (#333) · 6d895da8

guoshzhao authored Apr 01, 2022

**Description**
Use the `log_raw_data` config option to control whether raw data is logged to a file. The default value is `no`. Set it to `yes` for particular benchmarks, such as the NCCL/RCCL tests, to save their raw data to a file.
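
As a rough illustration of the switch described above, the sketch below shows how a yes/no option such as `log_raw_data` could gate writing raw output to a file. The function names, the dict-shaped config, and the output file name are assumptions made for this example; they do not describe SuperBench's actual implementation.

```python
from pathlib import Path


def raw_data_enabled(config):
    """Interpret a hypothetical `log_raw_data` option; the default is 'no'."""
    value = config.get('log_raw_data', 'no')
    # Some YAML loaders turn a bare yes/no into a boolean, so accept both forms.
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == 'yes'


def save_raw_data(config, raw_output, output_dir):
    """Write raw_output to a file only when the option is enabled.

    Returns the path written, or None when logging is disabled.
    The file name 'rawdata.log' is an assumption for this sketch.
    """
    if not raw_data_enabled(config):
        return None
    path = Path(output_dir) / 'rawdata.log'
    path.write_text(raw_output)
    return path
```

Keeping the default at 'no' means only benchmarks that opt in produce the extra file.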

6d895da8

31 Mar, 2022 1 commit

Bump minimist from 1.2.5 to 1.2.6 in /website (#334) · d368d90e

dependabot[bot] authored Mar 31, 2022

Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

d368d90e

24 Mar, 2022 1 commit

Analyzer: Add feature - Add result summary in excel,md,html format (#320) · 84fed1ce

user4543 authored Mar 24, 2022

**Description**
Add result summary in Excel, Markdown, and HTML formats.

**Major Revision**
- Add ResultSummary class to support result summary in Excel, Markdown, and HTML formats.
- Abstract RuleBase class for functions commonly used by DataDiagnosis and ResultSummary.

84fed1ce

22 Mar, 2022 1 commit

Bug: Benchmarks - remove fp16 samples type converting time (#332) · c5aa4f4e

user4543 authored Mar 22, 2022

**Description**
Remove fp16 samples type converting time for training cnn and lstm inference.

c5aa4f4e

21 Mar, 2022 1 commit

Config - Add inference config for NC A100 and NV A10 series (#329) · a9634ef5

Yifan Xiong authored Mar 21, 2022

Add inference config for preview SKUs, including:
* [NC96ads_A100_v4](https://docs.microsoft.com/en-us/azure/virtual-machines/nc-a100-v4-series)
* [NV18ads_A10_v5](https://docs.microsoft.com/en-us/azure/virtual-machines/nva10v5-series)

a9634ef5

17 Mar, 2022 1 commit

Bug: Benchmarks - remove fp16 samples type converting time for cnn and lstm models (#330) · 6e749180

user4543 authored Mar 17, 2022

**Description**
Remove fp16 samples type converting time for cnn and lstm models.

6e749180

16 Mar, 2022 1 commit

Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324) · ff51a3ce

rafsalas19 authored Mar 16, 2022

**Description**
Modifications adding GPU-Burn to SuperBench.
- added third party submodule
- modified Makefile to make gpu-burn binary
- added/modified microbenchmarks to add gpu-burn python scripts
- modified default and azure_ndv4 configs to add gpu-burn

ff51a3ce

15 Mar, 2022 2 commits

Bug: Executor - fix bug in result writing to files for mpi mode (#328) · 84359fd8

user4543 authored Mar 16, 2022

**Description**
Fix the bug in writing results to files in MPI mode.

84359fd8

Analyzer - Add md and html output format for DataDiagnosis (#325) · b3c95f18

user4543 authored Mar 15, 2022

**Description**
Add md and html output format for DataDiagnosis.

**Major Revision**
- add md and html support in file_handler
- add interface in DataDiagnosis for md and HTML output

**Minor Revision**
- move excel and json output interface into DataDiagnosis

b3c95f18

09 Mar, 2022 1 commit

Bug - Fix env path to absolute path (#327) · f755c0b6

Yifan Xiong authored Mar 09, 2022

Fix the env file path to an absolute path in `docker exec`, in case SSH and local connections are mixed or different users are used.

f755c0b6

07 Mar, 2022 2 commits

Analyzer: Revise - Abstract RuleBase from DataDiagnosis (#321) · 1ec055e1

user4543 authored Mar 07, 2022

**Description**
Abstract RuleBase from DataDiagnosis.

1ec055e1

Bump url-parse from 1.5.8 to 1.5.10 in /website (#323) · 97595271

dependabot[bot] authored Mar 07, 2022

Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.5.8 to 1.5.10.
- [Release notes](https://github.com/unshiftio/url-parse/releases)
- [Commits](https://github.com/unshiftio/url-parse/compare/1.5.8...1.5.10)

---
updated-dependencies:
- dependency-name: url-parse
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

97595271

06 Mar, 2022 1 commit

Benchmarks - Keep BatchNorm as fp32 for pytorch cnn models cast to fp16 (#322) · a9ef0f99

Jeff Daily authored Mar 06, 2022

**Description**
The BatchNorm operator is not numerically stable in fp16. The PyTorch documentation recommends keeping the BN op in fp32 for fp16 AMP models. Refer to https://pytorch.org/docs/stable/amp.html#ops-that-can-autocast-to-float32. Preserving BN in fp32 for superbench more accurately reflects real workloads.

a9ef0f99

28 Feb, 2022 2 commits

Dockerfile - Add dockerfile for rocm5.0.1 (#319) · 425b9ff8

user4543 authored Feb 28, 2022

**Description**
Add dockerfile for rocm5.0.1.

425b9ff8

Bump prismjs from 1.23.0 to 1.27.0 in /website (#318) · 74a3b123

dependabot[bot] authored Feb 28, 2022

Bumps [prismjs](https://github.com/PrismJS/prism) from 1.23.0 to 1.27.0.
- [Release notes](https://github.com/PrismJS/prism/releases)
- [Changelog](https://github.com/PrismJS/prism/blob/master/CHANGELOG.md)
- [Commits](https://github.com/PrismJS/prism/compare/v1.23.0...v1.27.0)

---
updated-dependencies:
- dependency-name: prismjs
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

74a3b123

25 Feb, 2022 1 commit

Dockerfile - Add rocm5.0 dockerfile (#307) · a4950a70

user4543 authored Feb 26, 2022

**Description**
Add rocm5.0 dockerfile.

a4950a70

24 Feb, 2022 2 commits

Bug Fix - Fix P2P detection in gpu_copy (#317) · 01304706

Ziyue Yang authored Feb 25, 2022

**Description**
Fix invalid reference of P2P detection result in gpu_copy.

01304706

Benchmarks: Build Pipeline - Make gpcnet only for cuda (#316) · 4f5027db

user4543 authored Feb 24, 2022

**Description**
Make gpcnet only for cuda.

4f5027db

22 Feb, 2022 1 commit

Bug - Fix empty HIP_ARCHITECTURES issue in cmake>=3.21.0 (#315) · e0c49142

user4543 authored Feb 22, 2022

**Description**
Fix the issue where HIP_ARCHITECTURES is empty with cmake>=3.21.0.
Refer to https://github.com/ROCm-Developer-Tools/HIP/pull/2364

e0c49142

21 Feb, 2022 1 commit

Bump url-parse from 1.5.1 to 1.5.8 in /website (#313) · 0740780b

dependabot[bot] authored Feb 21, 2022

Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.5.1 to 1.5.8.
- [Release notes](https://github.com/unshiftio/url-parse/releases)
- [Commits](https://github.com/unshiftio/url-parse/compare/1.5.1...1.5.8)

---
updated-dependencies:
- dependency-name: url-parse
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

0740780b

20 Feb, 2022 2 commits

Config - Add T4 configurations for inference (#311) · ea2c10ab

Yifan Xiong authored Feb 20, 2022

Add T4 configurations for inference.

ea2c10ab

Analyzer: Add Feature - Add multi-rules feature for data diagnosis (#289) · 97ed12f9

user4543 authored Feb 20, 2022

**Description**
Add multi-rules feature for data diagnosis to support multiple rules' combined check.

**Major Revision**
- revise rule design to support multiple rules combination check
- update related codes and tests

97ed12f9

15 Feb, 2022 1 commit

Bug - Fix env file path (#310) · 1f48268b

Yifan Xiong authored Feb 15, 2022

Fix env file path for `docker run`.

1f48268b