- 16 Sep, 2021 1 commit
-
-
Yifan Xiong authored
Integrate system info for node, add `sb node info` command.
-
- 14 Sep, 2021 1 commit
-
-
guoshzhao authored
**Description** 1. Call `enable_language(CUDA)` before using `CMAKE_CUDA_COMPILER_VERSION`. 2. Use `cmake --install` to install targets, which calls `cmake -P cmake_install.cmake` instead of `make Makefile`, to avoid the error `make: *** No rule to make target 'install'. Stop.`
-
- 13 Sep, 2021 5 commits
-
-
Yifan Xiong authored
Add ROCm image build in GitHub Actions.
-
Yuting Jiang authored
**Description** Fix bug in building hipBusBandwidth. **Major Revision** - The build failed to enter the check for 'hip/samples/1_Utils/hipBusBandwidth/CMakeLists.txt' when building the docker image, so this check is removed. - Add sb_micro_path for rocm_bandwidthTest.
-
Yuting Jiang authored
**Description** Restore the rocblas build logic, dropping support for building rocblas in the rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 base image. **Major Revision** - Restore the rocblas build logic; remove the GPU target limit and other resource limits for rocm4.0.
-
Yuting Jiang authored
**Description** Add a barrier before 'destroy_process_group' to fix a bug on rocm4.x when running bert_models: when one model benchmark contains multiple models, some processes have not yet finished with the previous process group while others fail to initialize the new process group for the next model. **Major Revision** - Add barrier before 'destroy_process_group'.
-
Yuting Jiang authored
**Description** Revise 'docker run' in sb deploy because the base image runs its endpoint/cmd under /root. **Major Revision** - Define the endpoint as bash in 'docker run'.
-
- 09 Sep, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Fix the incorrect parameter name `operations` of rccl-bw in the HPE MI100 config. **Major Revision** - operations -> operation
-
- 06 Sep, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Add script to generate system config info. **Major Revision** - Add script to generate system config info into the dict in superbench/tools.
-
- 03 Sep, 2021 1 commit
-
-
Yuting Jiang authored
Benchmarks: Code Revision - Revise arguments of nccl/rccl to support mpi mode and rename metric (#189) **Description** Revise the arguments of nccl/rccl to support mpi mode (mpi could not run in nccl/rccl because multiple operators ran in sequence without a barrier) and rename the metric. **Major Revision** - revise argument operators to be a single one **Minor Revision** - rename metric to remove benchmark name info - change argument ngpus default value to be 1
-
- 02 Sep, 2021 6 commits
-
-
Yifan Xiong authored
__Description__ Resolve the "too many open files" issue when running NCCL/RCCL on multiple nodes using Docker images by increasing the nofile number in limits.conf.
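Whether a raised nofile limit is actually in effect inside a container can be checked from Python. This is a minimal sketch, assuming a Unix host; the helper names and the `required` threshold are illustrative and not part of SuperBench.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_nofile_sufficient(required, limits=None):
    """Check whether the soft nofile limit allows `required` open files.

    `limits` may be passed explicitly as (soft, hard); otherwise the
    limits of the current process are read.
    """
    soft, _hard = limits if limits is not None else nofile_limits()
    # RLIM_INFINITY means "no limit", so any requirement is satisfied.
    return soft == resource.RLIM_INFINITY or soft >= required


if __name__ == '__main__':
    print(nofile_limits())
```

Running the script inside the container prints the limits the benchmark processes will inherit.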
-
Ziyue Yang authored
**Description** This commit fixes the error caused by a missing 'percentile' key when parsing FIO results.
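One defensive way to parse such a result is to treat the 'percentile' entry as optional. The sketch below is not the SuperBench parser: it assumes a job entry shaped like fio's JSON output (`<direction>` -> `clat_ns` -> `percentile`), and the function name and metric naming scheme are hypothetical.

```python
def extract_percentiles(job, direction):
    """Return latency percentiles for one direction ('read' or 'write').

    Every level is looked up with .get(), so a result without the
    'percentile' key yields an empty dict instead of raising KeyError.
    """
    latency = job.get(direction, {}).get('clat_ns', {})
    percentiles = latency.get('percentile', {})
    # Hypothetical metric naming: '99.000000' becomes 'read_lat_ns_99'.
    return {
        f'{direction}_lat_ns_{float(p):g}': value
        for p, value in percentiles.items()
    }
```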
-
Yuting Jiang authored
Benchmarks: Add Configuration - Add microbenchmark in the validation config file for HPE (AMD MI100) (#176) **Description** Add microbenchmark in the validation config file for AMD MI100. **Major Revision** - add rccl-bw, mem-bw, ib-loopback, gemm-flops, kernel-launch config for mi100
-
Yifan Xiong authored
Support docsearch in website, powered by [Algolia](https://docsearch.algolia.com).
-
Yifan Xiong authored
__Description__ Fix inventory bug in ansible_runner when the host list is provided with multiple hosts. This ought to be handled by the ansible_runner lib; work around it by using the `--inventory` arg on the command line.
-
TobeyQin authored
**Description** Add system config info for result collection
-
- 01 Sep, 2021 3 commits
-
-
guoshzhao authored
**Description** Revise the DockerBenchmark base to support image pull, image rm etc. **Major Revision** - image pull in _preprocess() - image clean in _postprocess() - execute customized commands in _benchmark() - add unit tests
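The lifecycle described above (pull, run customized commands, clean up) can be sketched as follows. This is an illustration only, not the actual DockerBenchmark base class: the class name, constructor arguments, and the injectable `run` callable are assumptions; only the `docker pull` and `docker rmi` commands are real Docker CLI calls.

```python
import subprocess


def _shell(command):
    """Run a shell command and return its exit code."""
    return subprocess.run(command, shell=True, check=False).returncode


class DockerBenchmarkSketch:
    """Hypothetical stand-in for a Docker-based benchmark."""

    def __init__(self, image, commands, run=_shell):
        self.image = image
        self.commands = list(commands)
        self._run = run  # injectable so the flow can be tested without Docker

    def preprocess(self):
        """Pull the image before benchmarking."""
        return self._run(f'docker pull {self.image}') == 0

    def benchmark(self):
        """Execute the customized commands, stopping at the first failure."""
        return all(self._run(command) == 0 for command in self.commands)

    def postprocess(self):
        """Remove the image after benchmarking."""
        return self._run(f'docker rmi {self.image}') == 0

    def run(self):
        if not self.preprocess():
            return False
        try:
            return self.benchmark()
        finally:
            # Clean up the image even if a command fails.
            self.postprocess()
```

Passing a fake `run` callable records the commands without needing a Docker daemon, which is one way unit tests for such a class could be written.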
-
guoshzhao authored
**Description** Install openmpi-4.0.0 for ROCm images.
-
guoshzhao authored
**Description** Setup docker environment in docker container. **Major Revision** - Install docker client for cuda and rocm images. - Mount /var/run/docker.sock from host
-
- 31 Aug, 2021 5 commits
-
-
Yuting Jiang authored
Benchmarks: Build Pipeline - Support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker (#172) **Description** Revise rocblas building logic in third_party/makefile to support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker. **Major Revision** - add extra build logic, including env settings for the pthread limit and build command restrictions, to reduce the amount of resources used **Minor Revision** - make rocm_version modifiable
-
Ziyue Yang authored
Benchmarks: Code Revision - Revise metric name generation and default config for disk performance benchmark (#175) **Description** This commit revises disk performance benchmark, including: 1) Add missing benchmark name in default config; 2) Avoid using reserved character ':' in metric name.
-
guoshzhao authored
**Description** Add dockerfile `rocm4.0-pytorch1.7.0.dockerfile` and `rocm4.2-pytorch1.7.0.dockerfile` for `rocm` platform.
-
guoshzhao authored
**Description** change the minimal version requirement for superbench:
```
'torch>=1.7.0a0',
'torchvision>=0.8.0a0',
```
-
guoshzhao authored
**Description** Package frequently-used subprocess invoke into function.
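A wrapper of this kind might look like the sketch below. The function name and defaults are assumptions for illustration, not the helper added in this commit; it only uses the standard `subprocess.run` API.

```python
import subprocess


def run_command(command, timeout=None):
    """Run a shell command and capture its output.

    Returns a subprocess.CompletedProcess whose `stdout` holds the
    combined stdout/stderr as text and whose `returncode` is the exit
    code. A non-zero exit code does not raise.
    """
    return subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        universal_newlines=True,   # decode output as text
        check=False,
        timeout=timeout,
    )
```

Callers can then write `run_command('nvidia-smi').stdout` instead of repeating the same `subprocess` arguments in every benchmark.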
-
- 30 Aug, 2021 6 commits
-
-
Ziyue Yang authored
**Description** This commit adds gpu_sm_copy benchmark and related tests.
-
TobeyQin authored
**Description** Revise results contributing rule. - Change the results uploading path to [superbench-results](https://github.com/microsoft/superbench-results) repo. - Add description of how to get system info by command. Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
-
Yifan Xiong authored
**Description** Add document for SuperBench YAML config file.
-
Yuting Jiang authored
**Description** Remove IB device port info in command to fix bug of IB loopback. **Major Revision** - Remove IB device port info in command to fix bug of IB loopback
-
Yuting Jiang authored
**Description** Add gemm flops microbenchmark for amd. **Major Revision** - Add gemm flops microbenchmark for amd. - Add related example and test file.
-
Yuting Jiang authored
**Description** Extract base class for gemm flops microbenchmark. **Major Revision** - extract base class for gemm flops microbenchmark and add related test. - revise gemm_flops_performance for cuda.
-
- 27 Aug, 2021 4 commits
-
-
guoshzhao authored
**Description** Rename `kernel_launch_overhead_event` to `event_overhead`, `kernel_launch_overhead_wall` to `wall_overhead`.
-
Yuting Jiang authored
**Description** Add memory bus bandwidth performance microbenchmark for amd. **Major Revision** - Add memory bus bandwidth performance microbenchmark for amd. - Add related example and test file.
-
Ziyue Yang authored
**Description** This commit adds the benchmark program for GPU-initiated data transfer benchmark.
-
Yuting Jiang authored
Benchmarks: Fix Bug - fix bug of microbenchmark building cublas and cudnn for amd in build pipeline (#166) **Description** Fix bug of microbenchmark building cublas and cudnn for amd. **Major Revision** - remove cuda LANGUAGES in project() - check CUDAToolkit quiet and then build if found
-
- 26 Aug, 2021 1 commit
-
-
Yuting Jiang authored
**Description** Rename computation_communication_overlap microbenchmark metric. **Major Revision** - remove rank info in metric. - simplify and rename metric.
-
- 25 Aug, 2021 1 commit
-
-
Yuting Jiang authored
**Description** extract base class for memory bandwidth microbenchmark. **Major Revision** - revise and optimize cuda_memory_bandwidth_performance - extract base class for memory bandwidth microbenchmark - add test for base class
-
- 23 Aug, 2021 1 commit
-
-
Yuting Jiang authored
**Description** fix typo in test_nccl_bw_performance.py. **Major Revision** - fix typo in test_nccl_bw_performance.py.
-
- 22 Aug, 2021 1 commit
-
-
Ziyue Yang authored
**Description** This commit adds readwrite I/O pattern for FIO benchmark. Read/write ratio is fixed at 4:1.
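A fixed 4:1 read/write ratio corresponds to 80% reads. The sketch below shows that arithmetic and how it could map onto fio's `--rw` and `--rwmixread` options. Whether the benchmark uses `rw` or `randrw`, and the job name, are assumptions; the argument list is deliberately incomplete (no size, block size, or runtime).

```python
def rwmix_read_percent(read_parts, write_parts):
    """Convert a read:write ratio into the read percentage fio expects."""
    total = read_parts + write_parts
    if read_parts < 0 or write_parts < 0 or total == 0:
        raise ValueError('ratio parts must be non-negative and not both zero')
    return round(100 * read_parts / total)


def fio_readwrite_args(filename, read_parts=4, write_parts=1):
    """Build a partial fio argument list for a mixed read/write job."""
    return [
        'fio',
        '--name=readwrite_sketch',  # hypothetical job name
        f'--filename={filename}',
        '--rw=rw',                  # assumed: sequential mixed pattern
        f'--rwmixread={rwmix_read_percent(read_parts, write_parts)}',
    ]
```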
-
- 20 Aug, 2021 2 commits
-
-
guoshzhao authored
**Description** Generate the summarized output files from all nodes. For each metric, do the reduce operation according to the `reduce_op`.
**Major Revision**
- Generate the summarized json file per node:
For microbenchmark, the format is `{benchmark_name}/[{run_count}/]{metric_name}[:rank]`
For modelbenchmark, the format is `{benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}`
`[]` means optional.
```
{
  "kernel-launch/overhead_event:0": 0.00583,
  "kernel-launch/overhead_event:1": 0.00545,
  "kernel-launch/overhead_event:2": 0.00581,
  "kernel-launch/overhead_event:3": 0.00572,
  "kernel-launch/overhead_event:4": 0.00559,
  "kernel-launch/overhead_event:5": 0.00591,
  "kernel-launch/overhead_event:6": 0.00562,
  "kernel-launch/overhead_event:7": 0.00586,
  "resnet_models/pytorch-resnet50/steptime-train-float32": 544.0827468410134,
  "resnet_models/pytorch-resnet50/throughput-train-float32": 353.7607016465773,
  "resnet_models/pytorch-resnet50/steptime-train-float16": 425.40482617914677,
  "resnet_models/pytorch-resnet50/throughput-train-float16": 454.0142363793973,
  "pytorch-sharding-matmul/0/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/1/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/0/allgather": 10.088025093078613,
  "pytorch-sharding-matmul/1/allgather": 10.088025093078613
}
```
- Generate the summarized jsonl file for all nodes; each line is the result from one node in json format.
-
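The microbenchmark key format and the per-metric reduction can be sketched as follows. This is an illustration under assumptions, not the SuperBench implementation: the function names, the set of `reduce_op` names, and the rule that a missing `reduce_op` keeps one `:rank` entry per value are all hypothetical.

```python
import json
import statistics

# Assumed reduce_op names; the real set may differ.
REDUCE_OPS = {'min': min, 'max': max, 'sum': sum, 'avg': statistics.mean}


def metric_key(benchmark, metric, run_count=None, rank=None):
    """Build '{benchmark_name}/[{run_count}/]{metric_name}[:rank]'."""
    parts = [benchmark]
    if run_count is not None:
        parts.append(str(run_count))
    parts.append(metric)
    key = '/'.join(parts)
    return key if rank is None else f'{key}:{rank}'


def summarize(benchmark, metric, per_rank_values, reduce_op=None):
    """Summarize one metric for one node.

    Without a reduce_op, keep one ':rank' entry per value (assumption);
    otherwise collapse the values into a single entry.
    """
    if reduce_op is None:
        return {
            metric_key(benchmark, metric, rank=rank): value
            for rank, value in enumerate(per_rank_values)
        }
    return {metric_key(benchmark, metric): REDUCE_OPS[reduce_op](per_rank_values)}


def to_jsonl(node_summaries):
    """Serialize per-node summaries as JSON Lines, one node per line."""
    return '\n'.join(json.dumps(summary) for summary in node_summaries)
```

For example, `metric_key('pytorch-sharding-matmul', 'allreduce', run_count=1)` yields `pytorch-sharding-matmul/1/allreduce`, matching the keys in the sample output.
-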
Yuting Jiang authored
**Description** Add build logic of hipBusBandwidth in third_party. **Major Revision** - Add build logic of hipBusBandwidth in third_party
-