1. 30 Nov, 2021 1 commit
  2. 29 Nov, 2021 1 commit
  3. 26 Nov, 2021 1 commit
  4. 25 Nov, 2021 1 commit
  5. 18 Nov, 2021 1 commit
  6. 15 Nov, 2021 1 commit
    • Benchmarks: Add Feature - Extend the device manager utility to support more functions. (#239) · cc70f9c1
      guoshzhao authored
      **Description**
      Rename the `nvidia_helper` utility to the `device_manager` module and add support for more functions:
      ```
      device_manager.get_device_count()
      device_manager.get_device_utilization(idx)
      device_manager.get_device_temperature(idx)
      device_manager.get_device_power_limit(idx)
      device_manager.get_device_memory(idx)
      device_manager.get_device_row_remapped_info(idx)
      device_manager.get_device_ecc_error(idx)
      ```
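The functions listed above can be combined into a per-device health summary. The following is a minimal sketch, not the SuperBench implementation: `FakeDeviceManager` is a hypothetical stand-in that only mirrors the function names from this commit, and its return values and types are invented for illustration. The helper `collect_device_report` is likewise an assumed name.

```python
from typing import Any, Dict, List


class FakeDeviceManager:
    """Hypothetical stand-in for the device_manager module.

    Only the method names come from the commit description; every
    return value and return type below is a made-up placeholder.
    """

    def get_device_count(self) -> int:
        return 2

    def get_device_utilization(self, idx: int) -> int:
        return 10 * (idx + 1)

    def get_device_temperature(self, idx: int) -> int:
        return 40 + idx

    def get_device_power_limit(self, idx: int) -> int:
        return 250

    def get_device_memory(self, idx: int) -> int:
        return 1024

    def get_device_row_remapped_info(self, idx: int) -> Any:
        return None

    def get_device_ecc_error(self, idx: int) -> Any:
        return None


def collect_device_report(dm: Any) -> List[Dict[str, Any]]:
    """Query every device index once and gather the results per device."""
    report = []
    for idx in range(dm.get_device_count()):
        report.append({
            "index": idx,
            "utilization": dm.get_device_utilization(idx),
            "temperature": dm.get_device_temperature(idx),
            "power_limit": dm.get_device_power_limit(idx),
            "memory": dm.get_device_memory(idx),
            "row_remapped_info": dm.get_device_row_remapped_info(idx),
            "ecc_error": dm.get_device_ecc_error(idx),
        })
    return report


if __name__ == "__main__":
    for entry in collect_device_report(FakeDeviceManager()):
        print(entry)
```

Passing the manager in as an argument keeps the sketch runnable on machines without a GPU. With the real module, the same loop would presumably take the imported `device_manager` object instead of the stand-in.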
  7. 12 Nov, 2021 1 commit
  8. 10 Nov, 2021 1 commit
  9. 09 Nov, 2021 2 commits
  10. 30 Oct, 2021 1 commit
  11. 29 Oct, 2021 1 commit
  12. 27 Oct, 2021 2 commits
  13. 22 Oct, 2021 2 commits
  14. 21 Oct, 2021 4 commits
  15. 12 Oct, 2021 3 commits
  16. 11 Oct, 2021 2 commits
  17. 28 Sep, 2021 1 commit
  18. 27 Sep, 2021 1 commit
  19. 26 Sep, 2021 1 commit
    • Release - SuperBench v0.3.0 (#212) · dfbd70b1
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.3.0 to main.
      
      **Major Revisions**
      * Docs - Upgrade version and release note (#209)
      * Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
      * Benchmarks: Update - Update benchmarks in configuration file (#208)
      * CI/CD - Update GitHub Action VM (#211)
      * Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
      * CI/CD - Fix bug in build image for push event (#205)
      * Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
      * Tool: Fix bug - Fix function naming issue in system info  (#200)
      * CI/CD - Push images in GitHub Action (#202)
      * Bug - Fix torch.distributed command for single node (#201)
      * CLI - Integrate system info for node (#199)
      * Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
      * CI/CD - Add ROCm image build in GitHub Actions (#194)
      * Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
      * Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
      * Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
      * Bug - Revise 'docker run' in sb deploy (#195)
      * Bug - Fix Bug : fix bug of error param operations to operation in rccl-bw of hpe config (#190)
      Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com>
      Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
      Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
  20. 06 Sep, 2021 1 commit
  21. 03 Sep, 2021 1 commit
    • Benchmarks: Code Revision - Revise arguments of nccl/rccl to support mpi mode and rename metric (#189) · 60762518
      Yuting Jiang authored
      
      **Description**
      Revise the arguments of nccl/rccl to support mpi mode (mpi mode cannot run in nccl/rccl because multiple operators run in sequence without a barrier) and rename the metric.
      
      **Major Revision**
      - revise the operators argument so that it takes a single operator
      
      **Minor Revision**
      - rename the metric to remove the benchmark name info
      - change the default value of the ngpus argument to 1
  22. 02 Sep, 2021 6 commits
  23. 01 Sep, 2021 3 commits
  24. 31 Aug, 2021 1 commit
    • Benchmarks: Build Pipeline - Support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker (#172) · b90b47f3
      Yuting Jiang authored
      
      **Description**
      Revise the rocblas build logic in third_party/makefile to support building rocblas in the rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker image.
      
      **Major Revision**
      - add extra build logic, including an environment setting for the pthread limit and a restricted build command, to reduce the amount of resources used
      
      **Minor Revision**
      - make rocm_version configurable