- 23 Apr, 2026 1 commit
-
-
one authored
Add gpu-hpl and gpu-hpl-mxp micro benchmarks backed by rocHPL and rocHPL-MxP.

Implemented a shared GPU HPL base that:
  - Generates per-workload HPL dat files and parses the corresponding output files.
  - Supports common HPL inputs such as process grid, matrix size, block size, broadcast topology, warmup, iterations, and reduce operator.
  - Adds rocHPL-specific tuning parameters for gpu-hpl.
  - Formats metric keys from input-derived workload attributes.
  - Reports `flops`, `time`, and `tests_pass` metrics with warmup-aware aggregation.

Add benchmark registrations, parser tests, sample output fixtures, documentation, and recommended configurations for gpu-hpl and gpu-hpl-mxp.

Update rocHPL and rocHPL-MxP third-party integration with build patches, install targets, and SuperBench run helper scripts.

Also update gpu-hpcg metric naming to use flops instead of gflops, remove standalone domain/verification-style metrics from the documented metric surface, and refresh Hygon HPCG documentation/config references accordingly.
-
- 27 Mar, 2026 1 commit
-
-
one authored
-
- 19 Mar, 2026 1 commit
-
-
one authored
  - Update rocm_commom.cmake for CMake>=3.24
  - Prevent isolation build
  - Add BabelStream as a submodule
  - Update dockerignore
-
- 01 Oct, 2025 1 commit
-
-
WenqingLan1 authored
Add support for cuda13.0. Add cuda13.0.dockerfile. Add cuda13.0 image building task to github pipeline. Update GPU STREAM to work with cuda13.0. Fix data type conversion perf bug in GPU stream. Update nvbandwidth submodule to be v0.8. Update perftest submodule to be 4bee61f80d9e268fc97eaf40be00409e91d3a19e (recent master).

---------

Co-authored-by: Ubuntu <dilipreddi@gmail.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>
-
- 21 Nov, 2024 1 commit
-
-
Hongtao Zhang authored
**Description**
Add nvbandwidth build to repo

---------

Co-authored-by: hongtaozhang <hongtaozhang@microsoft.com>
-
- 08 Jan, 2024 1 commit
-
-
Yifan Xiong authored
**Description**
Cherry-pick bug fixes from v0.10.0 to main.

**Major Revisions**
* Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
* Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
* Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
* Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
* Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
* CI/CD - Add ndv5 topo file #597
* Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
* Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
* Dockerfile - Bug fix for rocm docker build and deploy #598
* Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
* Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
* Monitor - Upgrade pyrsmi to amdsmi python library. #601
* Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
* Dockerfile - Add rocm6.0 dockerfile #602
* Bug Fix - Bug fix for latest megatron-lm benchmark #600
* Docs - Upgrade version and release note #606

Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>
-
- 07 Dec, 2023 1 commit
-
-
Ziyue Yang authored
**Description**
Add MSCCL support for Nvidia GPU
-
- 16 Mar, 2022 1 commit
-
-
rafsalas19 authored
**Description**
Modifications adding GPU-Burn to SuperBench.
  - added third party submodule
  - modified Makefile to make gpu-burn binary
  - added/modified microbenchmarks to add gpu-burn python scripts
  - modified default and azure_ndv4 configs to add gpu-burn
-
- 21 Oct, 2021 1 commit
-
-
Yuting Jiang authored
**Description**
Add gpcnet as git submodule and building logic.

**Major Revision**
  - add gpcnet as a submodule
  - add build logic in third_party/Makefile
-
- 29 Jul, 2021 1 commit
-
-
Yuting Jiang authored
**Description**
Support rocm in third_party/makefile and add rccl-tests as a submodule with building logic.

**Major Revision**
  - Support rocm in third_party/makefile
  - Add rccl-tests as a submodule
  - Add build logic in third_party/Makefile for rccl-tests
-
- 19 Jul, 2021 1 commit
-
-
Ziyue Yang authored
**Description**
Add FIO benchmark tool into third-party dependency.

**Major Revision**
  - Add FIO submodule into third-party directory and modify Makefile to enable it.
-
- 16 Jul, 2021 2 commits
-
-
Yuting Jiang authored
Add perftest as a submodule and add build logic
-
Yuting Jiang authored
Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic.
-
- 15 Jul, 2021 1 commit
-
-
Yuting Jiang authored
Benchmarks: Fix bug - fix bug of third_party/cuda-samples git checkout issue when building docker (#126)
* fix bug in docker build of third_party/cuda-samples
-
- 13 Jul, 2021 1 commit
-
-
Yuting Jiang authored
Add microbenchmark, example, test, and config for cuda memory performance; add cuda-samples (tag with cuda version) as git submodule and update related makefile
-
- 01 Jun, 2021 1 commit
-
-
guoshzhao authored
* add cutlass as submodule.
* add build script for cutlass.
* only support compute capability 7.0 (V100) and 8.0 (A100)
-