Commits · 41a484fa70085c4a2333cd0aa96b18c91d7310b4 · tsoc / superbenchmark

15 Feb, 2025 1 commit

Bugfix: Avoid Unintended nvbandwidth Function Calls in All Benchmarks (#685) · 41a484fa

Hongtao Zhang authored Feb 14, 2025



Root Cause:

1. '_get_all_test_cases()' was called in '_parser', and '_parser' is
defined in the base class, so the call was triggered for all benchmarks.
2. In '_get_all_test_cases()', the cmd path was not included.

Fix:

1. Remove '_get_all_test_cases()' from '_parser'.
2. Construct path for cmd.
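
The two fixes above can be pictured with a minimal Python sketch. Everything in it is hypothetical and is not SuperBench code: the class name `NvbandwidthSketch`, the `bin_dir` and `runner` parameters, and the use of the `--list` option to enumerate test cases are assumptions made for illustration only.

```python
import os
import subprocess


class NvbandwidthSketch:
    """Hypothetical stand-in for a benchmark class (not SuperBench code)."""

    def __init__(self, bin_dir, runner=subprocess.run):
        self._bin_dir = bin_dir
        # The runner is injectable so the sketch can be exercised without the binary.
        self._runner = runner

    def add_parser_arguments(self):
        # Only static metadata is declared here, so building the parser
        # never starts an external process.
        return {'test_cases': {'type': 'list', 'default': []}}

    def _command_path(self):
        # Build the full path instead of relying on a bare command name.
        return os.path.join(self._bin_dir, 'nvbandwidth')

    def get_all_test_cases(self):
        # Called lazily, and only by the benchmark that needs the list.
        # '--list' is an assumed option for listing test cases.
        result = self._runner(
            [self._command_path(), '--list'], capture_output=True, text=True
        )
        return result.stdout.splitlines()
```

The point of the sketch is only the ordering: parser construction has no side effects, and the external command is resolved to a full path before it is run.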

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>

41a484fa

05 Feb, 2025 2 commits

Bugfix - nvbandwidth benchmark need to handle N/A value (#675) · 45d06647

Hongtao Zhang authored Feb 05, 2025



**Description**

1. Fixed a bug where the nvbandwidth benchmark did not handle 'N/A' values
in the nvbandwidth cmd output.
2. Replaced the input format of test cases with a list.
3. Added an nvbandwidth configuration example to the default config files.
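
As a rough illustration of item 1, the sketch below parses one row of a bandwidth matrix in which some cells are 'N/A'. The row layout (a row index followed by whitespace-separated cells), the function name `parse_matrix_row`, and the choice to represent 'N/A' as `None` are assumptions; the commit does not say how the benchmark records such cells.

```python
def parse_matrix_row(line):
    """Parse 'row_index v0 v1 ...' where a cell may be the literal 'N/A'.

    Returns (row_index, values); 'N/A' cells become None (an assumption).
    """
    tokens = line.split()
    row_index = int(tokens[0])
    values = [None if token == 'N/A' else float(token) for token in tokens[1:]]
    return row_index, values


row, values = parse_matrix_row(' 0       N/A    276.07')
print(row, values)
```

A caller would then skip the `None` cells when emitting metrics instead of failing on `float('N/A')`.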

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

45d06647

Bug - Fix tensorrt-inference parsing (#674) · 7af7c0b7

Kirill Prosvirov authored Feb 05, 2025

**Description**
While running a benchmark on my machine today, I hit an issue with
tensorrt-inference: it returned code 33, which according to the source
code is:
```
MICROBENCHMARK_RESULT_PARSING_FAILURE = 33
```
Digging into the code, I found that the parser fails on the following
line:
```
[11/28/2024-17:03:11] [I] Latency: min = 7.2793 ms, max = 10.1606 ms, mean = 7.41642 ms, median = 7.39551 ms, percentile(99%) = 8 ms
```
Testing that line separately showed that the regular expression does not
handle cases like this one, where a result in milliseconds is an integer
(here `8 ms`) rather than a decimal.
This pull request makes the smallest change to the regular expression
that fixes the issue without introducing other bugs.
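
To make the failure mode concrete, here is a small sketch of a pattern that accepts both integer and decimal millisecond values. The pattern and the function name `parse_latency` are illustrative assumptions; they are not the expression used in SuperBench before or after this pull request.

```python
import re

LINE = (
    '[11/28/2024-17:03:11] [I] Latency: min = 7.2793 ms, max = 10.1606 ms, '
    'mean = 7.41642 ms, median = 7.39551 ms, percentile(99%) = 8 ms'
)

# A number with an optional fractional part, so both '8' and '7.2793' match.
NUMBER = r'\d+(?:\.\d+)?'
# A metric name such as 'min' or 'percentile(99%)', then ' = <number> ms'.
PATTERN = re.compile(r'(\w+(?:\(\d+%\))?) = (' + NUMBER + r') ms')


def parse_latency(line):
    """Return a dict mapping each latency metric name to its value in ms."""
    return {name: float(value) for name, value in PATTERN.findall(line)}


print(parse_latency(LINE))
```

A pattern that requires a decimal point, such as `\d+\.\d+`, would not match the `8` in `percentile(99%) = 8 ms`, which is the kind of mismatch described above.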

**Major Revision**
- 0.11.0

7af7c0b7

04 Feb, 2025 3 commits

Update Flake8 repo (#683) · b55279ad

pdr authored Feb 04, 2025

Flake8 has moved from GitLab to GitHub.
Update the repo path in the pre-commit config.

b55279ad

Microbenchmark - Add arch support for 10.0 in gemm-flops (#680) · 1d09b111
Hongtao Zhang authored Feb 03, 2025
```
**Description**
Introduce architecture support for version 10.0 in gemm-flops.
```
1d09b111

Setup - Fix installation and lint issues (#684) · 424f7b5b

Yifan Xiong authored Feb 03, 2025

Fix installation and lint issues:

* Fix transformer installation in Python3.7 due to upgrade of safetensors.
* Fix lint issues in mypy 1.14.1.

424f7b5b

08 Jan, 2025 1 commit

Bump nanoid from 3.3.6 to 3.3.8 in /website (#678) · 060f4f82

dependabot[bot] authored Jan 07, 2025

Bumps [nanoid](https://github.com/ai/nanoid) from 3.3.6 to 3.3.8.
- [Release notes](https://github.com/ai/nanoid/releases)
- [Changelog](https://github.com/ai/nanoid/blob/main/CHANGELOG.md)
- [Commits](ai/nanoid@3.3.6...3.3.8)

---
updated-dependencies:
- dependency-name: nanoid
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>

060f4f82

28 Nov, 2024 2 commits

Benchmarks - Add LLaMA-2 Models (#668) · 249e21c1

pdr authored Nov 27, 2024

Added llama benchmark (training and inference), in line with the
existing pytorch model implementations such as gpt2 and lstm.

- added llama fp8 unit test for better code coverage, to reduce memory
required
- updated transformers version >= 4.28.0 for LlamaConfig
- set tokenizers version <= 0.20.3 to avoid 0.20.4 version
[issues](https://github.com/huggingface/tokenizers/issues/1691) with
py3.8
- added llama2 to tensorrt
- llama2 tests not added to test_tensorrt_inference_performance.py due
to the large memory requirement on the worker gpu; tests were validated
separately on gh200

---------
Co-authored-by: dpatlolla <dpatlolla@microsoft.com>

249e21c1

Bug Fix - Fix stderr message in gpu-copy benchmark (#673) · 4e6935ab
pdr authored Nov 27, 2024
```
Fix ordering of args in err messages.
```
4e6935ab

27 Nov, 2024 1 commit

CI/CD - Upgrade dependency versions in pipeline (#671) · 96f5ccea

Yifan Xiong authored Nov 26, 2024



Upgrade dependency versions in Azure pipeline:

* Remove Python 3.6 and add Python 3.10 for cpu-unit-test
* Upgrade CUDA from 11.1 to 12.4 for cuda-unit-test
* Update labels accordingly

---------
Co-authored-by: Dilip Patlolla <dilipreddi@gmail.com>

96f5ccea

22 Nov, 2024 1 commit

Benchmarks: micro benchmarks - add nvbandwidth benchmark (#669) · 7cef624e

Hongtao Zhang authored Nov 21, 2024



**Description**

Add nvbandwidth benchmark.

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>

7cef624e

21 Nov, 2024 2 commits
- Benchmarks: micro benchmarks - add nvbandwidth build (#665) · c8c52eb2
  Hongtao Zhang authored Nov 21, 2024
```
**Description**
Add nvbandwidth build to repo

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
```
  c8c52eb2
- Docs - Update CODEOWNERS (#670) · 54eeac25
  Yifan Xiong authored Nov 20, 2024
```
Update CODEOWNERS for docs.
```
  54eeac25
20 Nov, 2024 1 commit

Benchmarks: micro benchmarks - add general CPU bandwidth and latency benchmark (#662) · 9c35e80a

Hongtao Zhang authored Nov 20, 2024



**Description**
Add micro benchmark to measure general CPU bandwidth and latency without 'mlc'.

Test output:
```
{
"cpu-memory-bw-latency/return_code": 0,
"cpu-memory-bw-latency/mem_bandwidth_matrix_numa_0_1_bw": 5388.75021,
"cpu-memory-bw-latency/mem_bandwidth_matrix_numa_0_1_lat": 0.185571786,
"cpu-memory-bw-latency/mem_bandwidth_matrix_numa_1_0_bw": 4634.82028,
"cpu-memory-bw-latency/mem_bandwidth_matrix_numa_1_0_lat": 0.215758096,
}
```

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>

9c35e80a

15 Nov, 2024 1 commit

Dependency - Bump onnxruntime-gpu version from 1.10.0 to 1.12.0 (#663) · a8a7bed2

Hongtao Zhang authored Nov 14, 2024



**Description**

Bump onnxruntime-gpu from 1.10.0 to 1.12.0.

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>

a8a7bed2

07 Nov, 2024 2 commits

Bump webpack from 5.76.1 to 5.96.1 in /website (#661) · 83ee4eba

dependabot[bot] authored Nov 07, 2024

Bumps [webpack](https://github.com/webpack/webpack) from 5.76.1 to 5.96.1.
- [Release notes](https://github.com/webpack/webpack/releases)
- [Commits](webpack/webpack@v5.76.1...v5.96.1)

---
updated-dependencies:
- dependency-name: webpack
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>

83ee4eba

Bump cookie and express in /website (#655) · c9b2b455

dependabot[bot] authored Nov 07, 2024

Bumps [cookie](https://github.com/jshttp/cookie) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.

Updates `cookie` from 0.6.0 to 0.7.1
- [Release notes](https://github.com/jshttp/cookie/releases)
- [Commits](jshttp/cookie@v0.6.0...v0.7.1)

Updates `express` from 4.21.0 to 4.21.1
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.1/History.md)
- [Commits](expressjs/express@4.21.0...4.21.1)

---
updated-dependencies:
- dependency-name: cookie
  dependency-type: indirect
- dependency-name: express
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>

c9b2b455

06 Nov, 2024 1 commit

Dockerfile - Add support for arm64 build (#660) · 47949127

pdr authored Nov 06, 2024

Add support for arm64 build:

- Updated dockerfile for arm64 build
- extend cpu stream compilation for neoverse 
- handle onnxruntime-gpu installation
- third party builds filtering based on arch
- disable cuda decode perf build for non x86

47949127

05 Nov, 2024 1 commit

Bug Fix - Fix numa error on grace cpu in gpu-copy (#658) · 59d36f7f

pdr authored Nov 05, 2024

The current GPU Copy BW Performance benchmark fails on Nvidia Grace
systems. These systems expose memory-only numa nodes, for which
numa_run_on_node fails and halts the benchmark completely.

This fix checks whether each numa node has CPU cores assigned. If a node
has none, it is skipped during args creation and the benchmark continues.
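
The check described above can be sketched in Python as follows. This is only an illustration and not the benchmark's own change: the function names are hypothetical, and the input is assumed to be the per-node cpulist strings that Linux exposes under `/sys/devices/system/node/node<N>/cpulist`.

```python
def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8' into a list of ints."""
    cpus = []
    for part in text.strip().split(','):
        if not part:
            continue  # an empty cpulist means the node has no CPU cores
        if '-' in part:
            first, last = part.split('-')
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def nodes_with_cpus(cpulists):
    """Return the numa node ids that have at least one CPU core assigned.

    `cpulists` maps node id -> cpulist string (hypothetical input shape).
    """
    return [node for node, text in sorted(cpulists.items()) if parse_cpulist(text)]


# Nodes 2 and 3 are memory-only in this made-up example, so they are skipped.
print(nodes_with_cpus({0: '0-71', 1: '72-143', 2: '', 3: '\n'}))
```

Only the nodes returned by `nodes_with_cpus` would then be used when building the per-node arguments.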

59d36f7f

02 Nov, 2024 1 commit

CI/CD - Update Image Build Pipeline (#659) · 61770b89

Yifan Xiong authored Nov 01, 2024

**Description**

Update image build.

**Major Revision**

* Remove ROCm 6.0 image due to outdated packages
* Remove build tag for ROCm
* Preserve build cache for 30 days

61770b89

10 Oct, 2024 1 commit

Release - SuperBench v0.11.0 (#654) · 949f9cb4

Yuting Jiang authored Oct 10, 2024



**Description**
Cherry pick bug fixes from v0.11.0 to main

**Major Revision**
* #645 
* #648 
* #646 
* #647 
* #651 
* #652 
* #650

---------
Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>

949f9cb4

19 Sep, 2024 1 commit

Bump serve-static and express in /website (#643) · 9f3231e9

dependabot[bot] authored Sep 20, 2024

Bumps [serve-static](https://github.com/expressjs/serve-static) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.

Updates `serve-static` from 1.15.0 to 1.16.2
- [Release notes](https://github.com/expressjs/serve-static/releases)
- [Changelog](https://github.com/expressjs/serve-static/blob/v1.16.2/HISTORY.md)
- [Commits](expressjs/serve-static@v1.15.0...v1.16.2)

Updates `express` from 4.19.2 to 4.21.0
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.0/History.md)
- [Commits](expressjs/express@4.19.2...4.21.0)

---
updated-dependencies:
- dependency-name: serve-static
  dependency-type: indirect
- dependency-name: express
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>

9f3231e9

20 Aug, 2024 1 commit
- Bug: Executor - Fix executor for Benchmark Execution Without Explicit Framework Field (#636) · 96cc4d93
  Yang Wang authored Aug 21, 2024
```
**Description**
Fix executor for Benchmark Execution Without Explicit Framework Field
```
  96cc4d93
16 Aug, 2024 1 commit

Bug Fix: Data Diagnosis - Fix bug of failure test and warning of pandas in data diagnosis (#638) · 7af75df3

Yuting Jiang authored Aug 16, 2024

**Description**
Fix bug of failure test and warning of pandas in data diagnosis.

**Major Revision**
- fix pandas warnings in replace and fillna caused by type downcasting
- fix bug where the failure check function checked only one matched
metric rather than all matched metrics
- fix bug when converting a metric regex into str when there is more
than one match group
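
The second item can be illustrated with a short sketch that evaluates every metric matching a pattern instead of stopping at the first match. The function name `failed_metrics`, the metric names, and the failure criterion (a missing or non-zero value) are assumptions for illustration; they are not taken from the data diagnosis module.

```python
import re


def failed_metrics(results, pattern, is_failure=lambda v: v is None or v != 0):
    """Return all metrics whose full name matches `pattern` and that fail.

    `results` maps metric name -> value. The default failure criterion
    (missing or non-zero value) is an assumption for this sketch.
    """
    regex = re.compile(pattern)
    return sorted(
        name for name, value in results.items()
        if regex.fullmatch(name) and is_failure(value)
    )


results = {
    'nccl-bw:0/return_code': 0,
    'nccl-bw:1/return_code': 1,
    'gemm-flops/return_code': 2,
}
# Checking only the first match (value 0) would miss the failure on ':1'.
print(failed_metrics(results, r'nccl-bw:\d+/return_code'))
```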

7af75df3

13 Aug, 2024 1 commit
- Bug Fix - Update Docker Exec Command for Persistent HPCX Environment (#635) · 46a57929
  Yang Wang authored Aug 14, 2024
```
Add 10-hpcx.sh to /etc/profile.d
Update the Docker exec command to ensure a persistent HPCX environment.
```
  46a57929
08 Aug, 2024 1 commit
- Use `types-setuptools` as `types-pkg_resources` is Yanked (#637) · 9de841bc
  Yang Wang authored Aug 08, 2024
```
* https://pypi.org/project/types-pkg-resources/
* Use types-setuptools instead
```
  9de841bc
28 Jul, 2024 1 commit
- CI/CD - Fix MSCCL build error in CUDA12.4 docker build pipeline (#633) · 2101e933
  Yuting Jiang authored Jul 29, 2024
```
**Description**
Fix MSCCL build error in CUDA12.4 docker build pipeline due to OOM
issue.
```
  2101e933
26 Jul, 2024 2 commits

Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops (#634) · e304cf15
Yuting Jiang authored Jul 26, 2024
```
**Description**
Add support for GPU arch 8.9 (NVIDIA L4/L40/L40s GPUs) in gemm-flops.
```
e304cf15

Bump express from 4.18.2 to 4.19.2 in /website (#618) · 4e27142a

dependabot[bot] authored Jul 26, 2024

Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2.
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](expressjs/express@4.18.2...4.19.2)

---
updated-dependencies:
- dependency-name: express
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>

4e27142a

25 Jul, 2024 3 commits

Bump ws from 6.2.2 to 6.2.3 in /website (#629) · b4945fb2

dependabot[bot] authored Jul 25, 2024

Bumps [ws](https://github.com/websockets/ws) from 6.2.2 to 6.2.3.
- [Release notes](https://github.com/websockets/ws/releases)
- [Commits](websockets/ws@6.2.2...6.2.3)

---
updated-dependencies:
- dependency-name: ws
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>

b4945fb2

Docs - fix typos (#628) · a4c87da0
omahs authored Jul 25, 2024
```
Docs - fix typos
```
a4c87da0

Bump ip from 1.1.5 to 1.1.9 in /website (#610) · 4102302a

dependabot[bot] authored Jul 25, 2024

Bumps [ip](https://github.com/indutny/node-ip) from 1.1.5 to 1.1.9.
- [Commits](indutny/node-ip@v1.1.5...v1.1.9)

---
updated-dependencies:
- dependency-name: ip
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>

4102302a

24 Jul, 2024 2 commits

Bump follow-redirects from 1.14.8 to 1.15.6 in /website (#613) · 6e556d76

dependabot[bot] authored Jul 24, 2024

Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.14.8 to 1.15.6.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](follow-redirects/follow-redirects@v1.14.8...v1.15.6)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>

6e556d76

Docs - Add BibTeX in README and repo (#632) · 1362732c
Yifan Xiong authored Jul 23, 2024
```
Add BibTeX for citation in README and repo.
```
1362732c

23 Jul, 2024 1 commit

Update omegaconf version to 2.3.0 (#631) · 9a3ce39d

Yang Wang authored Jul 24, 2024

Update `omegaconf` version to
[2.3.0](https://pypi.org/project/omegaconf/2.3.0/) as omegaconf 2.0.6
has a non-standard dependency specifier PyYAML>=5.1.*. pip 24.1 will
enforce this behaviour change.
Discussion can be found at https://github.com/pypa/pip/issues/12063.

9a3ce39d

22 Apr, 2024 1 commit

Dockerfile - Add CUDA 12.4 dockerfile (#619) · 7435f10a

Yuting Jiang authored Apr 22, 2024

**Description**
Add CUDA 12.4 dockerfile.

**Major Revision**
- upgrade nvidia docker to 23.04


**Minor Revision**
- upgrade hpcx to 2.18

7435f10a

18 Apr, 2024 1 commit
- Dockerfile - Upgrade mlc to v3.11 (#620) · dc3846cb
  Yuting Jiang authored Apr 18, 2024
```
**Description**
Upgrade mlc to v3.11.
```
  dc3846cb
02 Apr, 2024 1 commit
- Benchmarks: Revise Code - Add hipblasLt tuning to dist-inference cpp implementation (#616) · cc89ee59
  Ziyue Yang authored Apr 02, 2024
```
**Description**
Adds hipblasLt tuning to dist-inference cpp implementation.
```
  cc89ee59
21 Mar, 2024 1 commit

Bug Fix - Bug fix for cuda 12.2 dockerfile LD_LIBRARY_PATH issue (#614) · eeaa9b1a

Yang Wang authored Mar 22, 2024

**Description**
The CUDA 12.2 image reports an undefined symbol error due to an incomplete
LD_LIBRARY_PATH:


![image](https://github.com/microsoft/superbenchmark/assets/25875482/1a7c48c7-cb6b-4e3a-abbe-dde23007a96b)

### How to reproduce:
1. Deploy sb with cuda12.2 image
```
sb deploy -f local.ini -i superbench/superbench:v0.10.0-cuda12.2
```
2. Enter the container
```
sudo docker exec -it sb-workspace bash
```
3. Execute `mpirun`:
```
root@sb-container:~# mpirun
mpirun: symbol lookup error: mpirun: undefined symbol: opal_libevent2022_event_base_loop
```
### Fix
* Append hpcx_load to /etc/bash.bashrc so that the LD_LIBRARY_PATH env is updated each time a shell starts

---------

eeaa9b1a

08 Jan, 2024 1 commit

Release - SuperBench v0.10.0 (#607) · 2c88db90

Yifan Xiong authored Jan 07, 2024



**Description**

Cherry-pick bug fixes from v0.10.0 to main.

**Major Revisions**

* Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
* Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
* Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
* Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
* Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
* CI/CD - Add ndv5 topo file #597
* Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
* Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
* Dockerfile - Bug fix for rocm docker build and deploy #598
* Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
* Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
* Monitor - Upgrade pyrsmi to amdsmi python library. #601
* Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
* Dockerfile - Add rocm6.0 dockerfile #602
* Bug Fix - Bug fix for latest megatron-lm benchmark #600
* Docs - Upgrade version and release note #606

Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>

2c88db90