1. 27 Mar, 2026 1 commit
  2. 19 Mar, 2026 1 commit
  3. 17 Mar, 2026 1 commit
  4. 01 Oct, 2025 1 commit
  5. 30 Sep, 2025 1 commit
    • Yuting Jiang's avatar
      Benchmarks: Micro benchmark - Add simultanneously all-to-host / host-to-all... · 93e9d262
      Yuting Jiang authored
      Benchmarks: Micro benchmark - Add simultanneously all-to-host / host-to-all bandwidth testcases to nvbandwidth (#736)
      
      **Description**
      Add simultanneously all-to-host / host-to-all bandwidth testcases to
      nvbandwidth .
      
      **Major Revision**
      - nvbandwidth.patch: Add simultanneously all-to-host / host-to-all
      bandwidth testcases to nvbandwidth
      - upgrade nvbandwidth submodule into v0.8
      - add patch into makefile build
      93e9d262
  6. 26 Jun, 2025 1 commit
  7. 25 Jun, 2025 1 commit
  8. 14 Jun, 2025 1 commit
    • Hongtao Zhang's avatar
      microbenchmark - CPU Stream Benchmark Revise (#712) · 991c0051
      Hongtao Zhang authored
      
      
      In the current implementation, the CPU‑stream benchmark code renames the
      binary before the microbench base class can verify its existence,
      causing the default‐binary check to fail.
      
      This PR adds a “default” binary—built with the standard compile
      parameters—so that the base class can always find and validate it. Once
      the default binary is in place, the CPU‑stream code will rename it as
      needed and re‑check its presence before running the benchmark.
      
      The PR also enable CPU stream in the default settings.
      
      ---------
      Co-authored-by: default avatarHongtao Zhang <hongtaozhang@microsoft.com>
      991c0051
  9. 21 Mar, 2025 1 commit
  10. 21 Nov, 2024 1 commit
  11. 06 Nov, 2024 1 commit
    • pdr's avatar
      Dockerfile - Add support for arm64 build (#660) · 47949127
      pdr authored
      Add support for arm64 build:
      
      - Updated dockerfile for arm64 build
      - extend cpu stream compilation for neoverse 
      - handle onnxruntime-gpu installation
      - third party builds filtering based on arch
      - disable cuda decode perf build for non x86
      47949127
  12. 28 Jul, 2024 1 commit
  13. 26 Jul, 2024 1 commit
  14. 08 Jan, 2024 1 commit
    • Yifan Xiong's avatar
      Release - SuperBench v0.10.0 (#607) · 2c88db90
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.10.0 to main.
      
      **Major Revisions**
      
      * Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
      * Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
      * Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
      * Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
      * Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
      * CI/CD - Add ndv5 topo file #597
      * Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
      * Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
      * Dockerfile - Bug fix for rocm docker build and deploy #598
      * Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
      * Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
      * Monitor - Upgrade pyrsmi to amdsmi python library. #601
      * Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
      * Dockerfile - Add rocm6.0 dockerfile #602
      * Bug Fix - Bug fix for latest megatron-lm benchmark #600
      * Docs - Upgrade version and release note #606
      Co-authored-by: default avatarZiyue Yang <ziyyang@microsoft.com>
      Co-authored-by: default avatarYang Wang <yangwang1@microsoft.com>
      Co-authored-by: default avatarYuting Jiang <yutingjiang@microsoft.com>
      Co-authored-by: default avatarguoshzhao <guzhao@microsoft.com>
      2c88db90
  15. 09 Dec, 2023 1 commit
  16. 07 Dec, 2023 2 commits
  17. 22 Nov, 2023 1 commit
  18. 27 Jul, 2023 1 commit
    • Yuting Jiang's avatar
      Release - SuperBench v0.9.0 (#558) · e1df877b
      Yuting Jiang authored
      **Description**
      Cherry-pick bug fixes from v0.9.0 to main.
      
      **Major Revision**
      - CI/CD: pipeline - clean more disk space to fix rocm building image
      pipeline(#555 )
      - Benchmarks: bug fix - use absolute path for input file in
      DirectXEncodingLatency(#554)
      - CI/CD - add push win docker image on release branch in pipeline (#552)
      - Docs - Upgrade version and release note(#557)
      e1df877b
  19. 03 Jul, 2023 1 commit
  20. 21 Mar, 2023 1 commit
  21. 24 Feb, 2023 1 commit
  22. 13 Feb, 2023 1 commit
  23. 29 Dec, 2022 1 commit
  24. 06 Jul, 2022 1 commit
    • Yifan Xiong's avatar
      Update dependencies and Dockerfile (#371) · 9f03d568
      Yifan Xiong authored
      Update dependencies and Dockerfile:
      * upgrade nccl-tests and rccl-tests to current latest version to match
        NCCL/RCCL versions
      * unify image tag names on DockerHub
      * remove verbose output in Dockerfile and minor fix some flags
      9f03d568
  25. 19 Jun, 2022 1 commit
    • Yifan Xiong's avatar
      Update ROCm Dockerfile (#361) · 483bf782
      Yifan Xiong authored
      **Description**
      
      Update ROCm Dockerfile.
      
      **Major Revisions**
      - Add dockerfile for ROCm 5.1.3
      - Merge 5.1.x and 5.0.x dockerfile
      - Remove 4.2 and 4.0 legacy
      - Update build pipeline accordingly
      483bf782
  26. 15 Jun, 2022 1 commit
    • Yifan Xiong's avatar
      Fix cmake and build issues (#360) · 60a3c743
      Yifan Xiong authored
      **Description**
      
      Fix cmake and build issues.
      
      **Major Revision**
      
      * Remove unnecessary boost build
      * Remove user-agent for mlc
      * Remove -j for third party to build each project in sequence
      * Fix ansible collections installation path
      60a3c743
  27. 16 Mar, 2022 1 commit
    • rafsalas19's avatar
      Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324) · ff51a3ce
      rafsalas19 authored
      **Description**
      Modifications adding GPU-Burn to SuperBench.
      - added third party submodule
      - modified Makefile to make gpu-burn binary
      - added/modified microbenchmarks to add gpu-burn python scripts
      - modified default and azure_ndv4 configs to add gpu-burn
      ff51a3ce
  28. 24 Feb, 2022 1 commit
  29. 09 Feb, 2022 1 commit
  30. 29 Jan, 2022 1 commit
  31. 30 Dec, 2021 1 commit
    • Yifan Xiong's avatar
      Release - SuperBench v0.4.0 (#278) · ff563b66
      Yifan Xiong authored
      
      
      __Description__
      
      Cherry-pick  bug fixes from v0.4.0 to main.
      
      __Major Revisions__
      
      * Bug - Fix issues for Ansible and benchmarks (#267)
      * Tests - Refine test cases for microbenchmark (#268)
      * Bug - Build openmpi with ucx support in rocm dockerfiles (#269)
      * Benchmarks: Fix Bug - Fix fio build issue (#272)
      * Docs - Unify metric and add doc for cublas and cudnn functions (#271)
      * Monitor: Revision - Add 'monitor/' prefix to monitor metrics in result summary (#274)
      * Bug - Fix bug of detecting if gpu_index is none (#275)
      * Bug - Fix bugs in data diagnosis (#273)
      * Bug - Fix issue that the root mpi rank may not be the first in the hostfile (#270)
      * Benchmarks: Configuration - Update inference and network benchmarks in configs (#276)
      * Docs - Upgrade version and release note (#277)
      Co-authored-by: default avatarYuting Jiang <v-yutjiang@microsoft.com>
      ff563b66
  32. 01 Dec, 2021 1 commit
  33. 21 Oct, 2021 1 commit
  34. 26 Sep, 2021 1 commit
    • Yifan Xiong's avatar
      Release - SuperBench v0.3.0 (#212) · dfbd70b1
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick  bug fixes from v0.3.0 to main.
      
      **Major Revisions**
      * Docs - Upgrade version and release note (#209)
      * Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
      * Benchmarks: Update - Update benchmarks in configuration file (#208)
      * CI/CD - Update GitHub Action VM (#211)
      * Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
      * CI/CD - Fix bug in build image for push event (#205)
      * Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
      * Tool: Fix bug - Fix function naming issue in system info  (#200)
      * CI/CD - Push images in GitHub Action (#202)
      * Bug - Fix torch.distributed command for single node (#201)
      * CLI - Integrate system info for node (#199)
      * Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
      * CI/CD - Add ROCm image build in GitHub Actions (#194)
      * Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
      * Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
      * Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
      * Bug - Revise 'docker run' in sb deploy (#195)
      * Bug - Fix Bug : fix bug of error param operations to operation in rccl-bw of hpe config (#190)
      Co-authored-by: default avatarYuting Jiang <v-yujiang@microsoft.com>
      Co-authored-by: default avatarGuoshuai Zhao <guzhao@microsoft.com>
      Co-authored-by: default avatarZiyue Yang <ziyyang@microsoft.com>
      dfbd70b1
  35. 31 Aug, 2021 1 commit
    • Yuting Jiang's avatar
      Benchmarks: Build Pipeline - Support rocblas building in... · b90b47f3
      Yuting Jiang authored
      Benchmarks: Build Pipeline - Support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker (#172)
      
      **Description**
      Revise rocblas building logic in third_party/makefile to support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker.
      
      **Major Revision**
      - add extra building logic including env about pthread limit and build command restrict to reduce amount of resource used
      
      **Minor Revision**
      - make rocm_version to be able to modify
      b90b47f3
  36. 20 Aug, 2021 1 commit
  37. 02 Aug, 2021 1 commit
  38. 29 Jul, 2021 2 commits