Commits · 7a3a450237c3e8e321622f1de6ace0ac761aa69d · tsoc / superbenchmark

13 Sep, 2021 1 commit

Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198) · 7a3a4502

Yuting Jiang authored Sep 13, 2021

**Description**
Add a barrier before 'destroy_process_group' to fix a bug that occurs when one model benchmark runs multiple models: some processes had not yet finished with the previous process group while others failed to initialize a new process group for the next model. The bug was seen on rocm4.x when running bert_models.

**Major Revision**
-  Add barrier before 'destroy_process_group'.
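
The ordering described above can be sketched as follows. This is a minimal illustration, not the code from the commit: `teardown_process_group` is a hypothetical helper, and its `dist` parameter is assumed to be the `torch.distributed` module (or any object exposing the same `is_available`, `is_initialized`, `barrier`, and `destroy_process_group` functions).

```python
def teardown_process_group(dist):
    """Synchronize all ranks, then destroy the default process group.

    `dist` is assumed to be the `torch.distributed` module. It is passed in
    as a parameter (a choice made for this sketch) so the helper can be
    exercised without GPUs. Returns True if a group was torn down.
    """
    if not (dist.is_available() and dist.is_initialized()):
        return False
    # Wait until every rank has finished the current model, so no rank starts
    # initializing the next process group while others still use this one.
    dist.barrier()
    dist.destroy_process_group()
    return True
```

In a real run the call would look like `teardown_process_group(torch.distributed)` after each model finishes.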

7a3a4502

03 Sep, 2021 1 commit

Benchmarks: Code Revision - Revise arguments of nccl/rccl to support mpi mode... · 60762518

Yuting Jiang authored Sep 03, 2021

Benchmarks: Code Revision - Revise arguments of nccl/rccl to support mpi mode and rename metric (#189)

**Description**
Revise arguments of nccl/rccl to support mpi mode (mpi mode could not run in nccl/rccl because multiple operators ran in sequence without a barrier) and rename metric.

**Major Revision**
- revise argument operators to be a single one

**Minor Revision**
- rename metric to remove benchmark name info
- change argument ngpus default value to be 1
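
As a rough illustration of these argument changes, the sketch below accepts exactly one operation per run and defaults `ngpus` to 1. The option names and the operation list are assumptions for this example, not the benchmark's actual interface. The binary names and the `-g` flag follow the nccl-tests convention (`<operation>_perf`, `-g <ngpus>`).

```python
import argparse

# Assumption: operation names chosen for this sketch, mapped to the
# nccl-tests style binaries that implement them.
BINARIES = {
    'allreduce': 'all_reduce_perf',
    'allgather': 'all_gather_perf',
    'broadcast': 'broadcast_perf',
    'reduce': 'reduce_perf',
    'reducescatter': 'reduce_scatter_perf',
}


def build_parser():
    """Build a parser that takes a single operation instead of a list."""
    parser = argparse.ArgumentParser(description='nccl/rccl bandwidth sketch')
    parser.add_argument('--operation', choices=sorted(BINARIES), default='allreduce')
    # Default of 1 GPU per process, as in the minor revision above.
    parser.add_argument('--ngpus', type=int, default=1)
    return parser


def build_command(argv):
    """Translate parsed arguments into one command line for one operation."""
    args = build_parser().parse_args(argv)
    return [BINARIES[args.operation], '-g', str(args.ngpus)]
```

Because each invocation runs a single operator, no barrier is needed between operators within one run.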

60762518

02 Sep, 2021 1 commit
- Benchmarks: Fix bug - Fix missing key error in disk performance benchmark (#188) · b79e2845
  Ziyue Yang authored Sep 02, 2021
```
**Description**
This commit fixes the missing-key error for 'percentile' when parsing FIO results.
```
  b79e2845
01 Sep, 2021 1 commit

Benchmarks: Code Revision - revise the DockerBenchmark base class (#179) · 37d5dfd5

guoshzhao authored Sep 01, 2021

**Description**
Revise the DockerBenchmark base class to support image pull, image rm, etc.

**Major Revision**
- image pull in _preprocess()
- image clean in _postprocess()
- execute customized commands in _benchmark()
- add unit tests

37d5dfd5

31 Aug, 2021 2 commits

Benchmarks: Code Revision - Revise metric name generation and default config... · 024a870b

Ziyue Yang authored Aug 31, 2021

Benchmarks: Code Revision - Revise metric name generation and default config for disk performance benchmark (#175)

**Description**
This commit revises disk performance benchmark, including:
1) Add missing benchmark name in default config;
2) Avoid using reserved character ':' in metric name.
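
One way to keep ':' out of metric names is sketched below. The helper name and the choice of '_' as the replacement are assumptions for illustration. The commit does not state how the names are actually built.

```python
RESERVED = ':'  # reserved character named in the description above


def safe_metric_name(*parts, sep='_'):
    """Join name parts into a metric name that contains no reserved character.

    Hypothetical helper: every ':' inside a part is replaced with `sep`,
    empty parts are skipped, and the remaining parts are joined with `sep`.
    """
    cleaned = [str(part).replace(RESERVED, sep) for part in parts if str(part)]
    return sep.join(cleaned)
```

For example, `safe_metric_name('nvme0n1', 'rand:read', 'iops')` yields `nvme0n1_rand_read_iops`.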

024a870b

Benchmarks: Code Revision - Revise subprocess invoke (#178) · 8cd264fd
guoshzhao authored Aug 31, 2021
```
**Description**
Package the frequently used subprocess invocation into a function.
```
8cd264fd
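
A wrapper of this kind might look like the sketch below. The function name, its signature, and the choice to merge stderr into stdout are assumptions for illustration, not the helper added by the commit.

```python
import subprocess
import sys


def run_command(command, timeout=None):
    """Run `command` (a list of arguments) and capture its combined output.

    Returns a subprocess.CompletedProcess. A non-zero exit code does not
    raise, so callers inspect `returncode` themselves.
    """
    return subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        universal_newlines=True,   # decode output as text
        check=False,
        timeout=timeout,
    )


if __name__ == '__main__':
    result = run_command([sys.executable, '-c', 'print("hello")'])
    print(result.returncode, result.stdout.strip())
```

Centralizing the call keeps output capture and error handling identical across benchmarks.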

30 Aug, 2021 4 commits

Benchmarks: Add Benchmark - Add GPU SM copy benchmark (#169) · b97197f0
Ziyue Yang authored Aug 30, 2021
```
**Description**
This commit adds gpu_sm_copy benchmark and related tests.
```
b97197f0

Benchmarks: Fix Bug - Remove ib device port info in command to fix bug of ib loopback (#173) · 95c9fc95

Yuting Jiang authored Aug 30, 2021

**Description**
Remove IB device port info in command to fix bug of IB loopback.

**Major Revision**
- Remove IB device port info in command to fix bug of IB loopback

95c9fc95

Benchmarks: Add Benchmark - Add gemm flops microbenchmark for amd (#152) · f3d53c3d

Yuting Jiang authored Aug 30, 2021

**Description**
Add gemm flops microbenchmark for amd.

**Major Revision**
- Add gemm flops microbenchmark for amd.
- Add related example and test file.

f3d53c3d

Benchmarks: Code Revision - Extract base class for gemm flops microbenchmark (#165) · b0df66f7

Yuting Jiang authored Aug 30, 2021

**Description**
Extract base class for gemm flops microbenchmark.

**Major Revision**
- extract base class for gemm flops microbenchmark and add related test.
- revise gemm_flops_performance for cuda.

b0df66f7

27 Aug, 2021 4 commits

Benchmarks: Code Revision - Rename kernel_launch_overhead metrics (#171) · 35114bae

guoshzhao authored Aug 28, 2021

**Description**
Rename `kernel_launch_overhead_event` to `event_overhead`, `kernel_launch_overhead_wall` to `wall_overhead`.

35114bae

Benchmarks: Add Benchmark - Add memory bus bandwidth performance microbenchmark for amd (#153) · 666e3a94

Yuting Jiang authored Aug 27, 2021

**Description**
Add memory bus bandwidth performance microbenchmark for amd.

**Major Revision**
- Add memory bus bandwidth performance microbenchmark for amd.
- Add related example and test file.

666e3a94

Benchmarks: Add Benchmark - Add GPU SM copy benchmark (#162) · 2880f71e
Ziyue Yang authored Aug 27, 2021
```
**Description**
This commit adds the program for the GPU-initiated data transfer benchmark.
```
2880f71e

Benchmarks: Fix Bug - fix bug of microbenmark building cublas and cudnn for... · 958ebc0e

Yuting Jiang authored Aug 27, 2021

Benchmarks: Fix Bug - fix bug of microbenmark building cublas and cudnn for amd in build pipeline (#166)

**Description**
Fix bug of microbenchmark building cublas and cudnn for amd.

**Major Revision**
- remove cuda LANGUAGES in project()
- check for CUDAToolkit quietly and build only if it is found

958ebc0e

26 Aug, 2021 1 commit

Benchmarks: Code Revision - Rename computation_communication_overlap microbenchmark metric (#167) · 34cd2e8c

Yuting Jiang authored Aug 26, 2021

**Description**
Rename computation_communication_overlap microbenchmark metric.

**Major Revision**
- remove rank info in metric.
- simplify and rename metric.

34cd2e8c

25 Aug, 2021 1 commit

Benchmarks: Code Revision - Extract base class for memory bandwidth microbenchmark (#159) · e5e84a2e

Yuting Jiang authored Aug 26, 2021

**Description**
Extract base class for memory bandwidth microbenchmark.

**Major Revision**
- revise and optimize cuda_memory_bandwidth_performance
- extract base class for memory bandwidth microbenchmark
- add test for base class

e5e84a2e

22 Aug, 2021 1 commit
- Benchmarks: Revise Benchmark - Add readwrite I/O pattern (#161) · 6774d7b7
  Ziyue Yang authored Aug 22, 2021
```
**Description**
This commit adds readwrite I/O pattern for FIO benchmark. Read/write ratio is fixed at 4:1.
```
  6774d7b7
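
A fixed 4:1 read/write ratio corresponds to 80% reads. The sketch below shows how that ratio could be turned into FIO options. `--rw` and `--rwmixread` are standard FIO options, but whether the commit uses sequential (`rw`) or random (`randrw`) mixing, and these exact flags, is an assumption. The helper name is hypothetical.

```python
def fio_mix_args(read_parts=4, write_parts=1, pattern='rw'):
    """Return FIO options for a mixed read/write workload.

    `pattern` is 'rw' for sequential mixed I/O or 'randrw' for random mixed
    I/O. The read share is passed as an integer percentage via --rwmixread.
    """
    read_percent = round(100 * read_parts / (read_parts + write_parts))
    return [f'--rw={pattern}', f'--rwmixread={read_percent}']
```

With the defaults this returns `['--rw=rw', '--rwmixread=80']`.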
20 Aug, 2021 1 commit

Runner: Add Feature - Generate summarized output files. (#157) · 7595d794

guoshzhao authored Aug 20, 2021

**Description**
Generate the summarized output files from all nodes. For each metric, apply the reduce operation specified by its `reduce_op`.

**Major Revision**
- Generate the summarized json file per node:
For microbenchmark, the format is `{benchmark_name}/[{run_count}/]{metric_name}[:rank]`
For modelbenchmark, the format is `{benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}`
`[]` means optional.
```
{
  "kernel-launch/overhead_event:0": 0.00583,
  "kernel-launch/overhead_event:1": 0.00545,
  "kernel-launch/overhead_event:2": 0.00581,
  "kernel-launch/overhead_event:3": 0.00572,
  "kernel-launch/overhead_event:4": 0.00559,
  "kernel-launch/overhead_event:5": 0.00591,
  "kernel-launch/overhead_event:6": 0.00562,
  "kernel-launch/overhead_event:7": 0.00586,
  "resnet_models/pytorch-resnet50/steptime-train-float32": 544.0827468410134,
  "resnet_models/pytorch-resnet50/throughput-train-float32": 353.7607016465773,
  "resnet_models/pytorch-resnet50/steptime-train-float16": 425.40482617914677,
  "resnet_models/pytorch-resnet50/throughput-train-float16": 454.0142363793973,
  "pytorch-sharding-matmul/0/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/1/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/0/allgather": 10.088025093078613,
  "pytorch-sharding-matmul/1/allgather": 10.088025093078613
}
```
- Generate the summarized jsonl file for all nodes, each line is the result from one node in json format.
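
The key formats and the jsonl layout above can be sketched as follows. The function names and the `REDUCERS` table are hypothetical. The sketch also assumes, as one reading of the sample output, that a metric without a reduce operation keeps one `:rank` entry per rank while a metric with one collapses to a single value.

```python
import json
import statistics

# Assumption: reduce operation names used only in this sketch.
REDUCERS = {'min': min, 'max': max, 'sum': sum, 'avg': statistics.mean}


def metric_key(benchmark_name, metric_name, run_count=None, rank=None):
    """Build '{benchmark_name}/[{run_count}/]{metric_name}[:rank]'.

    For model benchmarks, pass '{benchmark_name}/{sub_benchmark_name}' as
    `benchmark_name` and leave `rank` unset.
    """
    parts = [benchmark_name]
    if run_count is not None:
        parts.append(str(run_count))
    parts.append(metric_name)
    key = '/'.join(parts)
    return key if rank is None else f'{key}:{rank}'


def summarize_metric(benchmark_name, metric_name, rank_values, reduce_op=None):
    """Summarize one metric from per-rank values on a single node."""
    if reduce_op is None:
        # No reduce operation: keep every rank as its own entry.
        return {
            metric_key(benchmark_name, metric_name, rank=rank): value
            for rank, value in enumerate(rank_values)
        }
    return {metric_key(benchmark_name, metric_name): REDUCERS[reduce_op](rank_values)}


def to_jsonl(node_summaries):
    """Serialize one summary dict per node, one JSON object per line."""
    return '\n'.join(json.dumps(summary, sort_keys=True) for summary in node_summaries)
```

For example, `metric_key('kernel-launch', 'overhead_event', rank=0)` gives `kernel-launch/overhead_event:0`, matching the sample keys above.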

7595d794

16 Aug, 2021 1 commit
- Benchmarks: Code Revision - change 'reduce' to 'reduce_op' (#156) · 7293e783
  guoshzhao authored Aug 16, 2021
```
**Description**
Change the field name `reduce` to `reduce_op`.
```
  7293e783
06 Aug, 2021 2 commits
- Benchmarks: Add Feature - Set reduce type for current benchmarks' metrics. (#149) · acf365a8
  guoshzhao authored Aug 06, 2021
```
**Description**
Set reduce type for current benchmarks' metrics, including model benchmarks and ShardingMatmul.
```
  acf365a8
- Benchmarks: Code Revision - Calculate average value by using statistics module. (#148) · bc1a61b9
  guoshzhao authored Aug 06, 2021
```
**Description**
Replace `sum(results) / len(results)` with `statistics.mean(results)`
```
  bc1a61b9
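
The two expressions give the same average. The main behavioral difference is the error raised on empty input. The values below are illustrative only.

```python
import statistics

results = [10.5, 11.0, 10.0, 12.5]  # illustrative values, not benchmark data

manual_average = sum(results) / len(results)
module_average = statistics.mean(results)

# Both approaches agree on non-empty input.
assert manual_average == module_average

# On empty input, sum()/len() raises ZeroDivisionError, while
# statistics.mean() raises the more descriptive statistics.StatisticsError.
try:
    statistics.mean([])
except statistics.StatisticsError as error:
    empty_error = type(error).__name__
```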
05 Aug, 2021 1 commit

Benchmarks: Add Feature - Add reduce function support for output summary. (#147) · e41b1f62

guoshzhao authored Aug 05, 2021

**Description**
Add reduce function support for output summary.

**Major Revision**
- Add reducer class to maintain all reduce functions.
- Save reduce type of each metric into `BenchmarkResult`
- Fix UT.
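
A reducer registry along these lines is sketched below. The class layout, method names, and reduce type names are assumptions for illustration. They are not taken from the commit.

```python
import statistics


class Reducer:
    """Hypothetical registry that maps a reduce type name to a function."""

    functions = {}

    @classmethod
    def add_reduce_func(cls, reduce_type):
        """Return a decorator that registers a function under `reduce_type`."""
        def decorator(func):
            cls.functions[reduce_type] = func
            return func
        return decorator

    @classmethod
    def get_reduce_func(cls, reduce_type):
        """Return the registered function, or None if the type is unknown."""
        return cls.functions.get(reduce_type)


# Register a few common reductions (names assumed for this sketch).
Reducer.add_reduce_func('max')(max)
Reducer.add_reduce_func('min')(min)
Reducer.add_reduce_func('sum')(sum)
Reducer.add_reduce_func('avg')(statistics.mean)


def reduce_metrics(values_by_metric, reduce_type_by_metric):
    """Apply each metric's saved reduce type to its list of values."""
    return {
        name: Reducer.get_reduce_func(reduce_type_by_metric[name])(values)
        for name, values in values_by_metric.items()
    }
```

Storing the reduce type next to each metric lets the summary step pick the right function without knowing which benchmark produced the value.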

e41b1f62

30 Jul, 2021 1 commit
- Benchmarks: Add Benchmark - Revise and add rccl microbenchmark for rocm (#143) · 157b4e2d
  Yuting Jiang authored Jul 30, 2021
```
**Description**
Add rccl bandwidth microbenchmark for rocm.

**Major Revision**
- Register rccl-bw benchmark.
```
  157b4e2d
29 Jul, 2021 1 commit

Release - SuperBench v0.2.1 (#142) · 69b2c631

Yifan Xiong authored Jul 29, 2021

__Description__
Cherry-pick bug fixes from v0.2.1 to main.

__Major Revisions__
* Fix bug where VGG models failed on A100 GPU with batch_size=128.
* Fix Ansible connection issue when running in localhost.
* Update version in packages and docs.

69b2c631

27 Jul, 2021 2 commits

Benchmarks: Add Benchmark - Add the source code of rocm kernel launch overhead benchmark. (#136) · 1ee8f7dc

Yuting Jiang authored Jul 27, 2021

**Description**
Add the source code of rocm kernel launch overhead benchmark. 

**Major Revision**
- Revise cmake build logic to support both cuda and rocm

1ee8f7dc

Benchmarks: Build Pipeline - Support rocm cmake build (#137) · fdc33f40

Yuting Jiang authored Jul 27, 2021

**Description**
Support rocm cmake build. 

**Major Revision**
- Add some envs in rocm_common.cmake to support rocm cmake build.

fdc33f40

26 Jul, 2021 1 commit

Benchmarks: Add Benchmark - Add NCCL performance benchmark (#113) · e083a598

Yuting Jiang authored Jul 26, 2021

**Description**
Add NCCL performance microbenchmark.

**Major Revision**
- Add microbenchmark, example, test, config for NCCL

e083a598

23 Jul, 2021 2 commits

Benchmarks: Add Benchmark - Add IB Loopback performance benchmark. (#112) · b0c5addc

Yuting Jiang authored Jul 24, 2021

**Description**
Add RDMA Loopback performance microbenchmark.

**Major Revision**
- Add microbenchmark, example, test, config for RDMA Loopback

b0c5addc

Benchmarks: Add Benchmark - Add disk performance benchmark (#132) · db297fb4

Ziyue Yang authored Jul 23, 2021

**Description**
Add disk performance microbenchmark.

**Major Revision**
- Add microbenchmark, example, test, config for disk performance.

**Minor Revision**
- Fix bugs in executor unit test related to default enabled tests.

db297fb4

13 Jul, 2021 1 commit

Benchmarks: Add Benchmark - Add memory bandwidth benchmark for cuda. (#114) · f9550bd6

Yuting Jiang authored Jul 13, 2021

Add microbenchmark, example, test, and config for cuda memory performance. Also add cuda-samples (tagged with the cuda version) as a git submodule and update the related makefile.

f9550bd6

30 Jun, 2021 1 commit
- Benchmarks: Fix Bug - Fix typo in gemm-flops benchmark. (#109) · 1e96c27e
  guoshzhao authored Jun 30, 2021
  
  1e96c27e
29 Jun, 2021 1 commit
- Benchmarks: Fix Bug - Fix gemm kernel bug for nvidia v100. (#105) · 8ffaddfa
  guoshzhao authored Jun 29, 2021
```
* fix bug for nvidia v100
* hard code the supported dict for different arch.
```
  8ffaddfa
28 Jun, 2021 2 commits
- Benchmarks: Add Configuration - Add validation config file for azure NDv4. (#103) · f22bb3f2
  guoshzhao authored Jun 28, 2021
```
* add config file for ndv4.
```
  f22bb3f2
- Benchmarks: Code Revision - Replace torch.optim.AdamW with transformers.AdamW. (#106) · 9c748527
  guoshzhao authored Jun 28, 2021
```
* replace torch.optim.AdamW with transformers.AdamW.
```
  9c748527
21 Jun, 2021 1 commit
- Benchmarks: Add Feature - Add DistributedImpl and DistributedBackend arguments... · 216c5b5c
  guoshzhao authored Jun 21, 2021
```
Benchmarks: Add Feature - Add DistributedImpl and DistributedBackend arguments for micro benchmark. (#100)
```
  216c5b5c
20 Jun, 2021 1 commit
- Bug bash - Rename bin name and metric name of cublas and cudnn microbenchmark (#99) · 3d72c078
  Yuting Jiang authored Jun 20, 2021
```
rename bin name and result metric of cublas and cudnn microbenchmark
```
  3d72c078
16 Jun, 2021 1 commit

Bug bash - Fix bugs and refine log in single GPU benchmarks (#97) · ddbc51a1

Yifan Xiong authored Jun 16, 2021

Fix bugs and refine log in single GPU benchmarks:

* Fix none framework issue
* Fix empty parameter bug
* Remove missed mobilenet_v3 models
* Change benchmark registration log to debug level
* Add pid in logging
* Add missing benchmarks in default config
* Fix deprecated logging warn

ddbc51a1

07 Jun, 2021 1 commit
- Benchmarks: Fix Bug - Fix OOM issue when run pytorch models sequentially. (#93) · 03b41be1
  guoshzhao authored Jun 07, 2021
```
* Clean up the cache.
```
  03b41be1
04 Jun, 2021 1 commit
- Benchmarks: Fix Bug - Fix return code overwrite issue (#94) · 2d9be807
  guoshzhao authored Jun 04, 2021
```
* fix return code reset issue
```
  2d9be807
02 Jun, 2021 1 commit
- Benchmarks: Code Revision - Change default shape of sharding-matmul. (#92) · 44c5103b
  guoshzhao authored Jun 02, 2021
```
* Change default shape of sharding-matmul.
```
  44c5103b