- 01 Sep, 2021 1 commit
-
-
guoshzhao authored
**Description** Set up the docker environment inside the docker container. **Major Revision** - Install the docker client for cuda and rocm images. - Mount /var/run/docker.sock from the host.
-
- 31 Aug, 2021 5 commits
-
-
Yuting Jiang authored
Benchmarks: Build Pipeline - Support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker (#172) **Description** Revise rocblas building logic in third_party/makefile to support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker. **Major Revision** - Add extra build logic, including an environment setting for the pthread limit and a restricted build command, to reduce the amount of resources used. **Minor Revision** - Make rocm_version configurable.
-
Ziyue Yang authored
Benchmarks: Code Revision - Revise metric name generation and default config for disk performance benchmark (#175) **Description** This commit revises disk performance benchmark, including: 1) Add missing benchmark name in default config; 2) Avoid using reserved character ':' in metric name.
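The metric-name fix above can be illustrated with a minimal sketch. It assumes that `:` is reserved because it separates a metric name from its rank suffix, as in the `{metric_name}[:rank]` format used by the summarized output. The helper names `sanitize_metric_name` and `with_rank` and the `_` replacement are hypothetical and are not taken from the commit.

```python
# Assumption: ':' is reserved as the separator before the rank suffix.
RESERVED_CHARS = ':'


def sanitize_metric_name(raw_name, replacement='_'):
    """Replace reserved characters in a raw metric name (hypothetical helper)."""
    return ''.join(replacement if ch in RESERVED_CHARS else ch for ch in raw_name)


def with_rank(metric_name, rank):
    """Append the rank suffix after sanitizing the name (hypothetical helper)."""
    return f'{sanitize_metric_name(metric_name)}:{rank}'


# Illustrative input only; the real disk benchmark metric names may differ.
clean_name = sanitize_metric_name('nvme0n1:rand_read_iops')
ranked_name = with_rank('a:b', 3)
```

After sanitizing, the only `:` left in a key is the one that introduces the rank, so the key can be split unambiguously.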
-
guoshzhao authored
**Description** Add dockerfile `rocm4.0-pytorch1.7.0.dockerfile` and `rocm4.2-pytorch1.7.0.dockerfile` for `rocm` platform.
-
guoshzhao authored
**Description** Change the minimum version requirements for superbench:
```
'torch>=1.7.0a0',
'torchvision>=0.8.0a0',
```
-
guoshzhao authored
**Description** Package frequently-used subprocess invoke into function.
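A wrapper of this kind might look like the sketch below. The function name `run_command` and its parameters are assumptions for illustration; the commit message does not show the actual signature.

```python
import subprocess
import sys


def run_command(command, timeout=None):
    """Run a command and return the CompletedProcess (hypothetical helper).

    stderr is merged into stdout so callers read a single text stream.
    """
    return subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode output as text
        timeout=timeout,
        check=False,  # leave return-code handling to the caller
    )


# Usage: invoke the current Python interpreter so the example is portable.
result = run_command([sys.executable, '-c', 'print("ok")'])
```

Centralizing the call keeps output capture and decoding consistent across benchmarks that shell out to external binaries.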
-
- 30 Aug, 2021 6 commits
-
-
Ziyue Yang authored
**Description** This commit adds gpu_sm_copy benchmark and related tests.
-
TobeyQin authored
**Description** Revise results contributing rule. - Change the results uploading path to the [superbench-results](https://github.com/microsoft/superbench-results) repo. - Add a description of how to get system info by command.
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
-
Yifan Xiong authored
**Description** Add document for SuperBench YAML config file.
-
Yuting Jiang authored
**Description** Remove IB device port info in command to fix bug of IB loopback. **Major Revision** - Remove IB device port info in command to fix bug of IB loopback
-
Yuting Jiang authored
**Description** Add gemm flops microbenchmark for amd. **Major Revision** - Add gemm flops microbenchmark for amd. - Add related example and test file.
-
Yuting Jiang authored
**Description** Extract base class for gemm flops microbenchmark. **Major Revision** - extract base class for gemm flops microbenchmark and add related test. - revise gemm_flops_performance for cuda.
-
- 27 Aug, 2021 4 commits
-
-
guoshzhao authored
**Description** Rename `kernel_launch_overhead_event` to `event_overhead`, `kernel_launch_overhead_wall` to `wall_overhead`.
-
Yuting Jiang authored
**Description** Add memory bus bandwidth performance microbenchmark for amd. **Major Revision** - Add memory bus bandwidth performance microbenchmark for amd. - Add related example and test file.
-
Ziyue Yang authored
**Description** This commit adds the benchmark program for GPU-initiated data transfer benchmark.
-
Yuting Jiang authored
Benchmarks: Fix Bug - fix bug of microbenchmark building cublas and cudnn for amd in build pipeline (#166) **Description** Fix bug of microbenchmark building cublas and cudnn for amd. **Major Revision** - Remove cuda LANGUAGES in project(). - Check for CUDAToolkit quietly and build only if it is found.
-
- 26 Aug, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Rename computation_communication_overlap microbenchmark metric. **Major Revision** - Remove rank info in metric. - Simplify and rename metric.
-
- 25 Aug, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Extract base class for memory bandwidth microbenchmark. **Major Revision** - Revise and optimize cuda_memory_bandwidth_performance. - Extract base class for memory bandwidth microbenchmark. - Add test for base class.
-
- 23 Aug, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Fix typo in test_nccl_bw_performance.py. **Major Revision** - Fix typo in test_nccl_bw_performance.py.
-
- 22 Aug, 2021 1 commit
-
-
Ziyue Yang authored
**Description** This commit adds readwrite I/O pattern for FIO benchmark. Read/write ratio is fixed at 4:1.
-
- 20 Aug, 2021 2 commits
-
-
guoshzhao authored
**Description** Generate the summarized output files from all nodes. For each metric, do the reduce operation according to the `reduce_op`.
**Major Revision**
- Generate the summarized json file per node:
For microbenchmark, the format is `{benchmark_name}/[{run_count}/]{metric_name}[:rank]`
For modelbenchmark, the format is `{benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}`
`[]` means optional.
```
{
  "kernel-launch/overhead_event:0": 0.00583,
  "kernel-launch/overhead_event:1": 0.00545,
  "kernel-launch/overhead_event:2": 0.00581,
  "kernel-launch/overhead_event:3": 0.00572,
  "kernel-launch/overhead_event:4": 0.00559,
  "kernel-launch/overhead_event:5": 0.00591,
  "kernel-launch/overhead_event:6": 0.00562,
  "kernel-launch/overhead_event:7": 0.00586,
  "resnet_models/pytorch-resnet50/steptime-train-float32": 544.0827468410134,
  "resnet_models/pytorch-resnet50/throughput-train-float32": 353.7607016465773,
  "resnet_models/pytorch-resnet50/steptime-train-float16": 425.40482617914677,
  "resnet_models/pytorch-resnet50/throughput-train-float16": 454.0142363793973,
  "pytorch-sharding-matmul/0/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/1/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/0/allgather": 10.088025093078613,
  "pytorch-sharding-matmul/1/allgather": 10.088025093078613
}
```
- Generate the summarized jsonl file for all nodes, each line is the result from one node in json format.
-
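As a rough illustration of the microbenchmark key format `{benchmark_name}/[{run_count}/]{metric_name}[:rank]` from the summarized-output commit above, the sketch below splits a key into its parts. The function `parse_micro_metric` is hypothetical and handles microbenchmark keys only; modelbenchmark keys carry a sub-benchmark name in the middle position and would need separate handling.

```python
def parse_micro_metric(key):
    """Split a microbenchmark key into (benchmark, run_count, metric, rank).

    Hypothetical helper: run_count and rank are None when the optional
    parts are absent.
    """
    parts = key.split('/')
    if len(parts) == 3:
        benchmark, run_count, metric = parts[0], int(parts[1]), parts[2]
    elif len(parts) == 2:
        benchmark, run_count, metric = parts[0], None, parts[1]
    else:
        raise ValueError(f'unexpected metric key: {key}')

    rank = None
    if ':' in metric:
        # The rank suffix follows the last ':' in the metric name.
        metric, rank_text = metric.rsplit(':', 1)
        rank = int(rank_text)
    return benchmark, run_count, metric, rank


with_rank_suffix = parse_micro_metric('kernel-launch/overhead_event:0')
with_run_count = parse_micro_metric('pytorch-sharding-matmul/1/allreduce')
```
-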
Yuting Jiang authored
**Description** Add build logic of hipBusBandwidth in third_party. **Major Revision** - Add build logic of hipBusBandwidth in third_party
-
- 19 Aug, 2021 1 commit
-
-
Yifan Xiong authored
Support mpi mode in runner: * concatenate mpirun command * support mca and env config * prepare hostfile and update Ansible host pattern Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
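The command assembly described above could be sketched as follows. It uses the Open MPI `mpirun` options `-np`, `-hostfile`, `-mca` and `-x`. The function `build_mpirun_command`, its parameters, and the sample program string are assumptions for illustration, not the runner's actual code.

```python
import shlex


def build_mpirun_command(program, proc_num, hostfile, mca=None, env=None):
    """Concatenate an mpirun command line (hypothetical helper)."""
    args = ['mpirun', '-np', str(proc_num), '-hostfile', hostfile]
    # Each MCA parameter becomes "-mca <key> <value>".
    for key, value in (mca or {}).items():
        args += ['-mca', key, str(value)]
    # Each environment variable is exported with "-x KEY=VALUE".
    for key, value in (env or {}).items():
        args += ['-x', f'{key}={value}']
    args += shlex.split(program)
    # Quote every argument so the result is safe to pass to a shell.
    return ' '.join(shlex.quote(arg) for arg in args)


command = build_mpirun_command(
    'all_reduce_perf -b 8',
    proc_num=2,
    hostfile='hostfile',
    mca={'btl': 'tcp,self'},
    env={'NCCL_DEBUG': 'INFO'},
)
```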
-
- 16 Aug, 2021 2 commits
-
-
Yifan Xiong authored
Add config and docs for development experience. __Major Revision__ - Add settings and extensions config for VSCode. - Add devcontainer config for Codespaces. - Update document accordingly.
-
guoshzhao authored
**Description** Change the field name `reduce` to `reduce_op`.
-
- 12 Aug, 2021 1 commit
-
-
Yifan Xiong authored
Add docs on: * Docker image tag list * Build image and run container instructions
-
- 09 Aug, 2021 1 commit
-
-
guoshzhao authored
Add ReduceType description into benchmarks doc.
-
- 06 Aug, 2021 2 commits
- 05 Aug, 2021 1 commit
-
-
guoshzhao authored
**Description** Add reduce function support for output summary. **Major Revision** - Add reducer class to maintain all reduce functions. - Save reduce type of each metric into `BenchmarkResult` - Fix UT.
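One way to picture a reducer class that maintains all reduce functions is a small registry keyed by reduce type. This is a minimal sketch: the `ReduceType` members shown here and the `Reducer.register`/`Reducer.get` methods are assumptions, and the real class may expose a different interface.

```python
from enum import Enum
from statistics import mean


class ReduceType(Enum):
    """Reduce types attached to metrics (members are assumed for illustration)."""
    AVG = 'avg'
    MAX = 'max'
    MIN = 'min'
    SUM = 'sum'


class Reducer:
    """Registry that maps each reduce type to its function (hypothetical API)."""
    _functions = {}

    @classmethod
    def register(cls, reduce_type, func):
        cls._functions[reduce_type] = func

    @classmethod
    def get(cls, reduce_type):
        return cls._functions[reduce_type]


Reducer.register(ReduceType.AVG, mean)
Reducer.register(ReduceType.MAX, max)
Reducer.register(ReduceType.MIN, min)
Reducer.register(ReduceType.SUM, sum)

# Usage: reduce the per-rank values of one metric with its saved reduce type.
values = [1.0, 2.0, 3.0]
reduced = Reducer.get(ReduceType.AVG)(values)
```

Storing the reduce type with each metric lets the summary step look up the matching function without hard-coding per-benchmark logic.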
-
- 02 Aug, 2021 2 commits
-
-
Yuting Jiang authored
**Description** Add rocBLAS building logic in third_party. **Major Revision** - Add rocm_rocblas target in third_party/Makefile. - Add rocblas building logic
-
TobeyQin authored
**Description** Add Executor and Benchmarks design doc **Major Revision** - Add Executor design doc - Add Benchmarks design doc
-
- 30 Jul, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Add rccl bandwidth microbenchmark for rocm. **Major Revision** - Register rccl-bw benchmark.
-
- 29 Jul, 2021 3 commits
-
-
Yuting Jiang authored
**Description** Support rocm in third_party/makefile and add rccl-tests as a submodule with building logic. **Major Revision** - Support rocm in third_party/makefile - Add rccl-tests as a submodule - Add build logic in third_party/Makefile for rccl-tests
-
Yifan Xiong authored
__Description__ Cherry-pick bug fixes from v0.2.1 to main. __Major Revisions__ * Fix bug where VGG models failed on A100 GPU with batch_size=128. * Fix Ansible connection issue when running on localhost. * Update version in packages and docs.
-
Yuting Jiang authored
**Description** Support rocm in third_party/makefile. **Major Revision** - Split rocm and cuda target in makefile - Add target in dockerfile
-
- 27 Jul, 2021 2 commits
-
-
Yuting Jiang authored
**Description** Add the source code of rocm kernel launch overhead benchmark. **Major Revision** - Revise cmake build logic to support both cuda and rocm
-
Yuting Jiang authored
**Description** Support rocm cmake build. **Major Revision** - Add some envs in rocm_common.cmake to support rocm cmake build.
-
- 26 Jul, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Add NCCL performance microbenchmark. **Major Revision** - Add microbenchmark, example, test, config for NCCL
-
- 23 Jul, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Add RDMA Loopback performance microbenchmark. **Major Revision** - Add microbenchmark, example, test, config for RDMA Loopback
-