- 14 Apr, 2023 1 commit
-
-
Yifan Xiong authored
__Description__ Upgrade version and release note. __Major Revision__ - Upgrade package versions - Add release note for v0.8.0
-
- 13 Apr, 2023 1 commit
-
-
Yuting Jiang authored
**Description** Fix wrong unit of cpu-memory-bw-latency in doc.
-
- 12 Apr, 2023 3 commits
-
-
Yifan Xiong authored
Update cuda11.8 image to cuda12.1 based on nvcr23.03 and related versions in the image: * cuda 11.8 -> 12.1 * nccl 2.15.5 -> 2.17.1 * hpcx: 2.8 -> 2.14 * mlc: 3.9a -> 3.10
-
Yifan Xiong authored
Remove an unreachable condition when writing the host list in mpi mode.
-
Yifan Xiong authored
Make num_workers configurable in the model benchmark data loader.
-
- 07 Apr, 2023 1 commit
-
-
guoshzhao authored
**Description** Collect realtime GPU power when benchmarking.
-
- 06 Apr, 2023 4 commits
-
-
Yuting Jiang authored
**Description** Fix a bug in getting metrics from the command output when an error happens (cudnn-function/_time:4).
-
Yuting Jiang authored
**Description** Analyzer: Fix a bug in Python 3.8 caused by a pandas API change. **Major Revision** - Force numeric-only checks on dataframes used for analysis - dataframe.append -> pd.concat - pd.ExcelWriter.save() -> pd.ExcelWriter.close()
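The pandas migration named in this commit can be sketched as follows. This is a minimal illustration, not the analyzer's actual code: the dataframe contents and column names are hypothetical. `DataFrame.append` and `ExcelWriter.save()` were deprecated and later removed in pandas 2.0, so `pd.concat` and `ExcelWriter.close()` (or a `with` block) take their place.

```python
import pandas as pd

# Hypothetical per-node results; the column names are illustrative only.
summary = pd.DataFrame({'node': ['n0'], 'bandwidth': [10.0]})
new_row = pd.DataFrame({'node': ['n1'], 'bandwidth': [12.0]})

# Old style: summary = summary.append(new_row)
# Replacement: concatenate the frames and rebuild the index.
summary = pd.concat([summary, new_row], ignore_index=True)

# Ask for numeric columns explicitly instead of relying on pandas to drop
# non-numeric columns such as 'node' during aggregation.
mean_values = summary.mean(numeric_only=True)
print(mean_values['bandwidth'])
```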
-
Ziyue Yang authored
**Description** This commit fixes incorrect `torch.empty_like` usage and the missing dtype and device arguments in communication wrappers.
-
Yifan Xiong authored
Fix a matrix size overflow when implicitly casting from int to size_t.
-
- 03 Apr, 2023 1 commit
-
-
guoshzhao authored
**Description** It looks like `grep cgroup /proc/filesystems` does not work for NDv4: its cgroup version is v1, but the command reports v2. Instead, check for file existence to determine the cgroup version.
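A file-existence check of this kind might look like the sketch below. The commit does not say which file is tested, so the choice of `cgroup.controllers` is an assumption; it relies on the fact that a cgroup v2 unified hierarchy exposes that file at its mount root, while cgroup v1 exposes one directory per controller. The function name and the `root` parameter are hypothetical.

```python
from pathlib import Path


def detect_cgroup_version(root='/sys/fs/cgroup'):
    """Return 2 if root looks like a cgroup v2 unified hierarchy, else 1."""
    # Assumption: the presence of cgroup.controllers at the mount root
    # is used as the marker for cgroup v2.
    marker = Path(root) / 'cgroup.controllers'
    return 2 if marker.is_file() else 1
```

Passing `root` explicitly keeps the check testable against a temporary directory instead of the live system.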
-
- 28 Mar, 2023 1 commit
-
-
Yifan Xiong authored
__Description__ Update TE FP8 model conversion. __Major Revisions__ * Add 16-byte alignment comment. * Fix TE layer parameters type.
-
- 25 Mar, 2023 1 commit
-
-
Yifan Xiong authored
Support Transformer Engine FP8 in existing PyTorch BERT/GPT2 models by converting linear/layernorm to TE layers.
-
- 24 Mar, 2023 1 commit
-
-
Ziyue Yang authored
**Description** This PR adds a micro-benchmark of distributed model inference workloads. **Major Revision** - Add a new micro-benchmark dist-inference. - Add corresponding example and unit tests. - Update configuration files to include this new micro-benchmark. - Update micro-benchmark README. --------- Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
-
- 22 Mar, 2023 2 commits
-
-
guoshzhao authored
**Description** Ubuntu 22.04 uses cgroup v2, which changes the file structure. Modify the monitor to support both cgroup v1 and v2.
-
Yifan Xiong authored
Support batch and shape range with multiplication factors in cublaslt gemm benchmark.
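A range with a multiplication factor could be expanded as in the sketch below. The `start:stop:factor` string syntax, the default factor of 2, and the function name are assumptions made for illustration; the commit message does not spell out the argument format.

```python
def expand_range(spec):
    """Expand 'start[:stop[:factor]]' into a list of ints, stop inclusive."""
    parts = [int(p) for p in spec.split(':')]
    start = parts[0]
    stop = parts[1] if len(parts) > 1 else start
    factor = parts[2] if len(parts) > 2 else 2  # assumed default
    if start < 1 or factor < 2:
        # Guard against ranges that would never terminate.
        raise ValueError('start must be >= 1 and factor must be >= 2')
    values = []
    current = start
    while current <= stop:
        values.append(current)
        current *= factor  # grow geometrically rather than by a fixed step
    return values
```

Under these assumptions, `expand_range('16:128:2')` yields `[16, 32, 64, 128]`, and a single value such as `'4096'` yields a one-element list.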
-
- 21 Mar, 2023 2 commits
-
-
rafsalas19 authored
**Description** - Adding HPL benchmark --------- Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
-
Yifan Xiong authored
Fix a potential barrier timeout in init_process_group caused by a race condition when reusing the same port. Use a different port for each model when running multiple models sequentially in one process. For example, vgg11/13/16/19 use ports 29501 to 29504 respectively.
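The port assignment in the example can be reproduced with a simple offset scheme. This is only a sketch: the base port of 29500 and the helper name are assumptions chosen so that the result matches the ports quoted in the commit message.

```python
BASE_PORT = 29500  # assumed base, chosen to match the ports quoted above


def port_for_model(index):
    """Return a distinct rendezvous port for the index-th model (1-based)."""
    return BASE_PORT + index


models = ['vgg11', 'vgg13', 'vgg16', 'vgg19']
# Each model run sequentially in the same process gets its own port, so a
# new process group never races with the previous one for the same socket.
ports = {name: port_for_model(i) for i, name in enumerate(models, start=1)}
print(ports)
```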
-
- 20 Mar, 2023 2 commits
-
-
Yuting Jiang authored
**Description** Support error tolerance in the micro-benchmark for cuDNN functions. **Major Revision** - Revise micro_base so that the remaining commands still run when one command fails in a micro-benchmark - Set error tolerance to true for cudnn functions
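The error-tolerance behaviour described here can be sketched as a command loop that either stops at the first failure or keeps going. The function name, the `tolerant` flag, and the result tuple layout are hypothetical and do not reflect the actual micro_base interface.

```python
import subprocess
import sys


def run_commands(commands, tolerant=False):
    """Run each argv list in order and collect (argv, returncode, stdout)."""
    results = []
    for argv in commands:
        proc = subprocess.run(argv, capture_output=True, text=True)
        results.append((argv, proc.returncode, proc.stdout))
        if proc.returncode != 0 and not tolerant:
            # Without tolerance, the first failure aborts the remaining commands.
            break
    return results


# Stand-in commands: the second one fails with exit code 3.
demo = [
    [sys.executable, '-c', 'print(1)'],
    [sys.executable, '-c', 'raise SystemExit(3)'],
    [sys.executable, '-c', 'print(2)'],
]
print([code for _, code, _ in run_commands(demo, tolerant=True)])
```

With `tolerant=True` the third command still runs after the second one fails, so metrics from the successful commands remain available.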
-
Yifan Xiong authored
Support FP64/TF32/FP16/BF16 in cublaslt (batch) GEMM.
-
- 17 Mar, 2023 1 commit
-
-
dependabot[bot] authored
Bumps [webpack](https://github.com/webpack/webpack) from 5.39.1 to 5.76.1. - [Release notes](https://github.com/webpack/webpack/releases) - [Commits](webpack/webpack@v5.39.1...v5.76.1) --- updated-dependencies: - dependency-name: webpack dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
-
- 06 Mar, 2023 2 commits
-
-
Yifan Xiong authored
Pin setuptools version to [v65.7.0](https://setuptools.pypa.io/en/latest/history.html#v65-7-0) to avoid breaking changes since v66.0.0.
-
Yifan Xiong authored
Limit ansible_runner version to less than 2.3.2 for Python3.6.
-
- 27 Feb, 2023 1 commit
-
-
Yuting Jiang authored
Benchmarks: Revision - Support flexible warmup and non-random data initialization in cublas-benchmark (#479) **Description** Revise cublas-benchmark to support flexible warmup and to fill input data with a fixed number for perf tests, which improves running efficiency. **Major Revision** - Remove num_in_steps for warmup to give users more flexible warmup settings - Add support for generating input with a fixed number for perf tests
-
- 24 Feb, 2023 1 commit
-
-
Yuting Jiang authored
**Description** Add support for installing cpu-only perftest in the makefile. Co-authored-by: Yuting Jiang <yuting.jiang@microsoft.com> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
-
- 23 Feb, 2023 1 commit
-
-
Yifan Xiong authored
Free more disk space in GitHub Action VHD.
-
- 17 Feb, 2023 2 commits
-
-
Yuting Jiang authored
**Description** Upgrade networkx version to fix installation compatibility issue.
-
dependabot[bot] authored
Bumps [@sideway/formula](https://github.com/sideway/formula) from 3.0.0 to 3.0.1. - [Release notes](https://github.com/sideway/formula/releases) - [Commits](hapijs/formula@v3.0.0...v3.0.1) --- updated-dependencies: - dependency-name: "@sideway/formula" dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
-
- 16 Feb, 2023 1 commit
-
-
dependabot[bot] authored
Bumps [http-cache-semantics](https://github.com/kornelski/http-cache-semantics) from 4.1.0 to 4.1.1. - [Release notes](https://github.com/kornelski/http-cache-semantics/releases) - [Commits](kornelski/http-cache-semantics@v4.1.0...v4.1.1) --- updated-dependencies: - dependency-name: http-cache-semantics dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
-
- 13 Feb, 2023 2 commits
-
-
rafsalas19 authored
**Description** - Added stream benchmark - Added stream unit test - Added stream example - Modified docker files to build stream --------- Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com> Co-authored-by: Yifan Xiong <xiongyf@yandex.com>
-
Yuting Jiang authored
**Description** Support SuperBench Executor running on Windows. **Major Revision** - Lazy import ansible related module
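Lazy importing defers loading a module until it is first used, so code paths that never touch it can run on platforms where it is unavailable. The wrapper below is one possible sketch; the commit does not say whether it uses a wrapper like this or function-local imports, and the `LazyModule` class is hypothetical. The standard-library `json` module stands in for the ansible-related module.

```python
import importlib


class LazyModule:
    """Defer importing a module until one of its attributes is first used."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Called only for attributes not found on the wrapper itself,
        # i.e. members of the wrapped module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# 'json' is a stand-in; nothing is imported until dumps() is looked up.
lazy_json = LazyModule('json')
print(lazy_json.dumps([1]))
```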
-
- 07 Feb, 2023 1 commit
-
- 30 Jan, 2023 1 commit
-
-
dependabot[bot] authored
Bumps [ua-parser-js](https://github.com/faisalman/ua-parser-js) from 0.7.28 to 0.7.33. - [Release notes](https://github.com/faisalman/ua-parser-js/releases) - [Changelog](https://github.com/faisalman/ua-parser-js/blob/master/changelog.md) - [Commits](faisalman/ua-parser-js@0.7.28...0.7.33) --- updated-dependencies: - dependency-name: ua-parser-js dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>
-
- 28 Jan, 2023 1 commit
-
-
Yifan Xiong authored
**Description** Cherry-pick bug fixes from v0.7.0 to main. **Major Revisions** * Benchmarks - Fix missing include in FP8 benchmark (#460) * Fix bug in TE BERT model (#461) * Doc - Update benchmark doc (#465) * Bug: Fix bug for incorrect datatype judgement in cublas-function source code (#464) * Support `sb deploy` without pulling image (#466) * Docs - Upgrade version and release note (#467) Co-authored-by: Russell J. Hewett <russell.j.hewett@gmail.com> Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
-
- 17 Jan, 2023 1 commit
-
-
Yuting Jiang authored
**Description** Fix bug for incorrect datatype judgement in cublas-function source code.
-
- 09 Jan, 2023 1 commit
-
-
dependabot[bot] authored
Bumps [json5](https://github.com/json5/json5) from 1.0.1 to 1.0.2. - [Release notes](https://github.com/json5/json5/releases) - [Changelog](https://github.com/json5/json5/blob/main/CHANGELOG.md) - [Commits](json5/json5@v1.0.1...v1.0.2) --- updated-dependencies: - dependency-name: json5 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 04 Jan, 2023 3 commits
-
-
Yang Wang authored
Support traffic patterns across different devices in NCCL/RCCL tests * Change the metrics format if a pattern is specified
-
Yang Wang authored
**Major Revision** - Add a pattern option that generates the mpi_pattern.txt file if a path is specified. - In mpi pattern, serial_index and parallel_index are added to each benchmark as environment variables. **Minor Revision** - Fix typo
-
Yifan Xiong authored
Support FP8 in PyTorch BERT models: * add fp8 hybrid/e4m3/e5m2 in precision arguments * build BERT encoders with `te.TransformerLayer` to replace `transformers.BertModel` * wrap forward steps with fp8 autocast
-
- 03 Jan, 2023 1 commit
-
-
Yang Wang authored
**Description** Support the following patterns in `mpi` mode: * `k-batch` * `topo-aware`
-