Commits · f91f97b60707f0d17aa16578c1719bea6b601062 · tsoc / superbenchmark

16 Sep, 2021 1 commit
- CLI - Integrate system info for node (#199) · f91f97b6
  Yifan Xiong authored Sep 16, 2021
```
Integrate system info for node, add `sb node info` command.
```
  f91f97b6
14 Sep, 2021 1 commit

Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196) · ff487387

guoshzhao authored Sep 14, 2021

**Description**
1. Call `enable_language(CUDA)` before using `CMAKE_CUDA_COMPILER_VERSION`.
2. Use `cmake --install` to install the targets, which calls `cmake -P cmake_install.cmake`, instead of `make Makefile`, to avoid the error `make: *** No rule to make target 'install'.  Stop.`

ff487387

13 Sep, 2021 5 commits

CI/CD - Add ROCm image build in GitHub Actions (#194) · 7656f329
Yifan Xiong authored Sep 13, 2021
```
Add ROCm image build in GitHub Actions.
```
7656f329

Bug: Fix bug - fix bug of hipBusBandwidth build (#193) · 7e48ad34

Yuting Jiang authored Sep 13, 2021

**Description**
Fix the hipBusBandwidth build failure.

**Major Revision**
- The check on 'hip/samples/1_Utils/hipBusBandwidth/CMakeLists.txt' did not pass when building the Docker image, so the check is removed.
- Add sb_micro_path for rocm_bandwidthTest.

7e48ad34

Benchmarks: Build Pipeline - Restore rocblas build logic (#197) · ee5c7662

Yuting Jiang authored Sep 13, 2021

**Description**
Restore the rocblas build logic, dropping support for building rocblas in the rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 base image.

**Major Revision**
- Restore the rocblas build logic; remove the GPU target limit and other resource limits for rocm4.0.

ee5c7662

Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198) · 7a3a4502

Yuting Jiang authored Sep 13, 2021

**Description**
Add a barrier before 'destroy_process_group' to fix a bug that occurs when one model benchmark runs multiple models: on rocm4.x, when running bert_models, some processes had not finished with the previous process group while others failed to initialize a new process group for the next model.

**Major Revision**
-  Add barrier before 'destroy_process_group'.

7a3a4502
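
The ordering described in this fix can be sketched as follows. This is a minimal illustration, not the code from the commit: `teardown_process_group` and `RecordingDist` are hypothetical names, and the `dist` argument stands in for `torch.distributed`, which provides `barrier()` and `destroy_process_group()`. It is passed in so the sketch runs without PyTorch.

```python
def teardown_process_group(dist):
    """Synchronize all ranks, then destroy the process group.

    `dist` must expose `barrier()` and `destroy_process_group()`,
    as `torch.distributed` does (hypothetical helper, for illustration).
    """
    # Wait until every rank has finished using the current process group.
    dist.barrier()
    # Only then tear the group down, so no rank starts initializing the
    # next model's group while another rank is still using this one.
    dist.destroy_process_group()


class RecordingDist:
    """Stand-in for torch.distributed that records the call order."""

    def __init__(self):
        self.calls = []

    def barrier(self):
        self.calls.append('barrier')

    def destroy_process_group(self):
        self.calls.append('destroy_process_group')


fake = RecordingDist()
teardown_process_group(fake)
print(fake.calls)
```

In a real run one would presumably pass the `torch.distributed` module itself after `init_process_group` has been called on every rank.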

Bug - Revise 'docker run' in sb deploy (#195) · 1f9de77f

Yuting Jiang authored Sep 13, 2021

**Description**

Revise 'docker run' in sb deploy because the base image runs its entrypoint/cmd under /root.

**Major Revision**

- Set the entrypoint to bash in 'docker run'.

1f9de77f

09 Sep, 2021 1 commit
- Bug - Fix Bug : fix bug of error param operations to operation in rccl-bw of hpe config (#190) · 14232b56
  Yuting Jiang authored Sep 09, 2021
```
**Description**
Fix the misnamed parameter 'operations' of rccl-bw in the HPE MI100 config.

**Major Revision**
- operations->operation
```
  14232b56
06 Sep, 2021 1 commit

Tools: Add Feature - Add script to generate system config info. (#160) · 37b15db9

Yuting Jiang authored Sep 06, 2021

**Description**
Add script to generate system config info.

**Major Revision**
- Add a script in superbench/tools that collects system config info into a dict.

37b15db9

03 Sep, 2021 1 commit

Benchmarks: Code Revision - Revise arguments of nccl/rccl to support mpi mode... · 60762518

Yuting Jiang authored Sep 03, 2021

Benchmarks: Code Revision - Revise arguments of nccl/rccl to support mpi mode and rename metric (#189)

**Description**
Revise the nccl/rccl arguments to support MPI mode (MPI could not run nccl/rccl because multiple operators ran in sequence without a barrier) and rename the metric.

**Major Revision**
- revise argument operators to be a single one

**Minor Revision**
- rename metric to remove benchmark name info
- change argument ngpus default value to be 1

60762518

02 Sep, 2021 6 commits

Dockerfile - Fix ulimit nofile in Docker images (#183) · 4e431f11

Yifan Xiong authored Sep 02, 2021

__Description__

Resolve the "too many open files" issue when running NCCL/RCCL on multiple nodes using Docker images by increasing the nofile limit in limits.conf.

4e431f11

Benchmarks: Fix bug - Fix missing key error in disk performance benchmark (#188) · b79e2845
Ziyue Yang authored Sep 02, 2021
```
**Description**
This commit fixes the missing key 'percentile' error when parsing FIO results.
```
b79e2845

Benchmarks: Add Configuration - Add microbenchmark in the validation config... · 47daedbe

Yuting Jiang authored Sep 02, 2021

Benchmarks: Add Configuration - Add microbenchmark in the validation config file for HPE (AMD MI100) (#176)

**Description**
Add microbenchmarks to the validation config file for AMD MI100.

**Major Revision**
- Add rccl-bw, mem-bw, ib-loopback, gemm-flops, and kernel-launch configs for MI100.

47daedbe

Docs - Support docsearch in website (#184) · 2ebb44cc
Yifan Xiong authored Sep 02, 2021
```
Support docsearch in website, powered by [Algolia](https://docsearch.algolia.com).
```
2ebb44cc

Runner - Fix inventory issue in ansible_runner (#185) · e2453e1c

Yifan Xiong authored Sep 02, 2021

__Description__

Fix inventory bug in ansible_runner when host list is provided with multiple hosts.

This ought to be handled by the ansible_runner library; as a workaround, pass the `--inventory` argument on the command line.

e2453e1c

Docs: Add system config info for result collection (#168) · ab71bbb4
TobeyQin authored Sep 02, 2021
```
**Description**
Add system config info for result collection
```
ab71bbb4

01 Sep, 2021 3 commits

Benchmarks: Code Revision - revise the DockerBenchmark base class (#179) · 37d5dfd5

guoshzhao authored Sep 01, 2021

**Description**
Revise the DockerBenchmark base class to support image pull, image rm, etc.

**Major Revision**
- image pull in _preprocess()
- image clean in _postprocess()
- execute customized commands in _benchmark()
- add unit tests

37d5dfd5

Dockerfile: Add Package - Install openmpi for ROCm images (#181) · 115cd2e6
guoshzhao authored Sep 01, 2021
```
**Description**
Install openmpi-4.0.0 for ROCm images.
```
115cd2e6

Benchmarks: Docker Benchmarks - Setup Docker-in-Docker environment (#180) · 7d947757

guoshzhao authored Sep 01, 2021

**Description**
Set up a Docker environment inside the Docker container.

**Major Revision**
- Install docker client for cuda and rocm images.
- Mount /var/run/docker.sock from host

7d947757

31 Aug, 2021 5 commits

Benchmarks: Build Pipeline - Support rocblas building in... · b90b47f3

Yuting Jiang authored Sep 01, 2021

Benchmarks: Build Pipeline - Support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker (#172)

**Description**
Revise rocblas building logic in third_party/makefile to support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker.

**Major Revision**
- Add extra build logic, including an environment setting for the pthread limit and a restricted build command, to reduce the amount of resources used.

**Minor Revision**
- Make rocm_version configurable.

b90b47f3

Benchmarks: Code Revision - Revise metric name generation and default config... · 024a870b

Ziyue Yang authored Aug 31, 2021

Benchmarks: Code Revision - Revise metric name generation and default config for disk performance benchmark (#175)

**Description**
This commit revises disk performance benchmark, including:
1) Add missing benchmark name in default config;
2) Avoid using reserved character ':' in metric name.

024a870b

Dockerfile: Add dockerfile - Add rocm 4.0 and 4.2 dockerfile with pytorch1.7.0 (#164) · a7f508e4
guoshzhao authored Aug 31, 2021
```
**Description**
Add dockerfile `rocm4.0-pytorch1.7.0.dockerfile` and `rocm4.2-pytorch1.7.0.dockerfile` for `rocm` platform.
```
a7f508e4

Setup: Revision - Revise torch extra_require (#177) · c8357f4e

guoshzhao authored Aug 31, 2021

**Description**
Change the minimum version requirements for superbench:
```
'torch>=1.7.0a0',
'torchvision>=0.8.0a0',
```

c8357f4e

Benchmarks: Code Revision - Revise subprocess invoke (#178) · 8cd264fd
guoshzhao authored Aug 31, 2021
```
**Description**
Wrap the frequently used subprocess invocation in a function.
```
8cd264fd
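
A wrapper of this kind might look like the sketch below. The function name `run_command` and its exact behavior (shell execution, merged stdout/stderr, no exception on failure) are assumptions made for illustration; the commit's actual helper is not shown here.

```python
import subprocess


def run_command(command):
    """Run a shell command and return the CompletedProcess.

    stdout and stderr are merged and decoded as text. A non-zero exit
    code does not raise, so the caller decides how to handle failures.
    (Hypothetical helper; name and defaults are assumptions.)
    """
    return subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )


result = run_command('echo hello')
print(result.returncode, result.stdout.strip())
```

Centralizing the call keeps the capture and decoding options consistent across benchmarks, so callers only inspect `returncode` and `stdout`.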

30 Aug, 2021 6 commits

Benchmarks: Add Benchmark - Add GPU SM copy benchmark (#169) · b97197f0
Ziyue Yang authored Aug 30, 2021
```
**Description**
This commit adds gpu_sm_copy benchmark and related tests.
```
b97197f0

Docs: Revision - Revise results contributing rule (#174) · de481cb0

TobeyQin authored Aug 30, 2021

**Description**
Revise results contributing rule.

- Change the results uploading path to the [superbench-results](https://github.com/microsoft/superbench-results) repo.
- Add description of how to get system info by command.

Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>

de481cb0

Docs: Add document for SuperBench YAML config (#158) · 0b74b2aa
Yifan Xiong authored Aug 30, 2021
```
**Description**
Add document for SuperBench YAML config file.
```
0b74b2aa

Benchmarks: Fix Bug - Remove ib device port info in command to fix bug of ib loopback (#173) · 95c9fc95

Yuting Jiang authored Aug 30, 2021

**Description**
Remove IB device port info in command to fix bug of IB loopback.

**Major Revision**
- Remove IB device port info in command to fix bug of IB loopback

95c9fc95

Benchmarks: Add Benchmark - Add gemm flops microbenchmark for amd (#152) · f3d53c3d

Yuting Jiang authored Aug 30, 2021

**Description**
Add gemm flops microbenchmark for amd.

**Major Revision**
- Add gemm flops microbenchmark for amd.
- Add related example and test file.

f3d53c3d

Benchmarks: Code Revision - Extract base class for gemm flops microbenchmark (#165) · b0df66f7

Yuting Jiang authored Aug 30, 2021

**Description**
Extract base class for gemm flops microbenchmark.

**Major Revision**
- extract base class for gemm flops microbenchmark and add related test.
- revise gemm_flops_performance for cuda.

b0df66f7

27 Aug, 2021 4 commits

Benchmarks: Code Revision - Rename kernel_launch_overhead metrics (#171) · 35114bae

guoshzhao authored Aug 28, 2021

**Description**
Rename `kernel_launch_overhead_event` to `event_overhead`, `kernel_launch_overhead_wall` to `wall_overhead`.

35114bae

Benchmarks: Add Benchmark - Add memory bus bandwidth performance microbenchmark for amd (#153) · 666e3a94

Yuting Jiang authored Aug 27, 2021

**Description**
Add memory bus bandwidth performance microbenchmark for amd.

**Major Revision**
- Add memory bus bandwidth performance microbenchmark for amd.
- Add related example and test file.

666e3a94

Benchmarks: Add Benchmark - Add GPU SM copy benchmark (#162) · 2880f71e
Ziyue Yang authored Aug 27, 2021
```
**Description**
This commit adds the program for the GPU-initiated data transfer benchmark.
```
2880f71e

Benchmarks: Fix Bug - fix bug of microbenchmark building cublas and cudnn for... · 958ebc0e

Yuting Jiang authored Aug 27, 2021

Benchmarks: Fix Bug - fix bug of microbenchmark building cublas and cudnn for amd in build pipeline (#166)

**Description**
Fix the bug where the cublas and cudnn microbenchmarks were built for AMD.

**Major Revision**
- Remove the CUDA LANGUAGES entry in project().
- Look up CUDAToolkit quietly and build only if it is found.

958ebc0e

26 Aug, 2021 1 commit

Benchmarks: Code Revision - Rename computation_communication_overlap microbenchmark metric (#167) · 34cd2e8c

Yuting Jiang authored Aug 26, 2021

**Description**
Rename the computation_communication_overlap microbenchmark metric.

**Major Revision**
- remove rank info in metric.
- simplify and rename metric.

34cd2e8c

25 Aug, 2021 1 commit

Benchmarks: Code Revision - Extract base class for memory bandwidth microbenchmark (#159) · e5e84a2e

Yuting Jiang authored Aug 26, 2021

**Description**
Extract base class for memory bandwidth microbenchmark.

**Major Revision**
- revise and optimize cuda_memory_bandwidth_performance
- extract base class for memory bandwidth microbenchmark
- add test for base class

e5e84a2e

23 Aug, 2021 1 commit
- Benchmarks: Code Revision - fix typo in test of nccl microbenchmark. (#163) · 0583862d
  Yuting Jiang authored Aug 23, 2021
```
**Description**
Fix typo in test_nccl_bw_performance.py.

**Major Revision**
- Fix typo in test_nccl_bw_performance.py.
```
  0583862d
22 Aug, 2021 1 commit
- Benchmarks: Revise Benchmark - Add readwrite I/O pattern (#161) · 6774d7b7
  Ziyue Yang authored Aug 22, 2021
```
**Description**
This commit adds readwrite I/O pattern for FIO benchmark. Read/write ratio is fixed at 4:1.
```
  6774d7b7
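
A fixed 4:1 read/write ratio corresponds to 80% reads. The sketch below works out that percentage and shows one way it could be passed to fio, using fio's `rw` and `rwmixread` options. Whether the benchmark uses these exact options, or a sequential rather than random pattern, is an assumption; the helper names are hypothetical.

```python
def read_percentage(read_parts, write_parts):
    """Convert a read:write ratio into the percentage of reads."""
    return 100 * read_parts // (read_parts + write_parts)


def fio_readwrite_args(read_parts=4, write_parts=1):
    """Build fio arguments for a mixed read/write pattern.

    Assumes a sequential mixed workload (`--rw=rw`); the benchmark's real
    command line is not shown in the commit description.
    """
    percentage = read_percentage(read_parts, write_parts)
    return ['--rw=rw', '--rwmixread={}'.format(percentage)]


print(fio_readwrite_args())
```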
20 Aug, 2021 2 commits

Runner: Add Feature - Generate summarized output files. (#157) · 7595d794

guoshzhao authored Aug 20, 2021

**Description**
Generate the summarized output files from all nodes. For each metric, apply the reduce operation specified by its `reduce_op`.

**Major Revision**
- Generate the summarized JSON file per node. Metric keys use the following formats, where `[]` marks an optional part:
  - For microbenchmarks: `{benchmark_name}/[{run_count}/]{metric_name}[:rank]`
  - For model benchmarks: `{benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}`
```
{
  "kernel-launch/overhead_event:0": 0.00583,
  "kernel-launch/overhead_event:1": 0.00545,
  "kernel-launch/overhead_event:2": 0.00581,
  "kernel-launch/overhead_event:3": 0.00572,
  "kernel-launch/overhead_event:4": 0.00559,
  "kernel-launch/overhead_event:5": 0.00591,
  "kernel-launch/overhead_event:6": 0.00562,
  "kernel-launch/overhead_event:7": 0.00586,
  "resnet_models/pytorch-resnet50/steptime-train-float32": 544.0827468410134,
  "resnet_models/pytorch-resnet50/throughput-train-float32": 353.7607016465773,
  "resnet_models/pytorch-resnet50/steptime-train-float16": 425.40482617914677,
  "resnet_models/pytorch-resnet50/throughput-train-float16": 454.0142363793973,
  "pytorch-sharding-matmul/0/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/1/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/0/allgather": 10.088025093078613,
  "pytorch-sharding-matmul/1/allgather": 10.088025093078613
}
```
- Generate the summarized JSONL file for all nodes; each line is the result from one node in JSON format.

7595d794
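
The key formats above can be illustrated with a short sketch. It is not the runner's implementation: `metric_key`, `reduce_ranks`, and the `REDUCE_OPS` table are hypothetical, and the sketch assumes that `reduce_op` collapses the per-rank values of a metric into one value, which the description does not spell out.

```python
import json
import statistics

# Hypothetical set of reduce operations; the runner's real set may differ.
REDUCE_OPS = {'min': min, 'max': max, 'mean': statistics.mean}


def metric_key(benchmark, metric, run_count=None, rank=None):
    """Build `{benchmark}/[{run_count}/]{metric}[:rank]`.

    For model benchmarks, pass `benchmark` as
    `{benchmark_name}/{sub_benchmark_name}`.
    """
    parts = [benchmark]
    if run_count is not None:
        parts.append(str(run_count))
    parts.append(metric)
    key = '/'.join(parts)
    if rank is not None:
        key = '{}:{}'.format(key, rank)
    return key


def reduce_ranks(summary, reduce_op):
    """Collapse `name:rank` entries into one value per metric name."""
    grouped = {}
    for key, value in summary.items():
        name, sep, _rank = key.rpartition(':')
        if not sep:
            # No rank suffix: the whole key is the metric name.
            name = key
        grouped.setdefault(name, []).append(value)
    op = REDUCE_OPS[reduce_op]
    return {name: op(values) for name, values in grouped.items()}


node_summary = {
    metric_key('kernel-launch', 'overhead_event', rank=0): 0.00583,
    metric_key('kernel-launch', 'overhead_event', rank=1): 0.00545,
    metric_key('pytorch-sharding-matmul', 'allreduce', run_count=0): 10.561786651611328,
}
reduced = reduce_ranks(node_summary, 'max')
# One JSON object per line is the JSONL layout described above.
jsonl_line = json.dumps(reduced, sort_keys=True)
print(jsonl_line)
```

Because `:` separates the rank from the metric name, a metric name that itself contains `:` would be split in the wrong place by a parser like this one.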

Benchmarks: Build Pipeline - Add build logic of hipBusBandwidth in third_party (#151) · a1e5c90d

Yuting Jiang authored Aug 20, 2021

**Description**
Add build logic of hipBusBandwidth in third_party.

**Major Revision**
- Add build logic of hipBusBandwidth in third_party

a1e5c90d