  1. 22 Sep, 2024 1 commit
  2. 28 Jul, 2024 1 commit
  3. 22 Apr, 2024 1 commit
  4. 08 Jan, 2024 1 commit
    • Release - SuperBench v0.10.0 (#607) · 2c88db90
      Yifan Xiong authored
      **Description**
      
      Cherry-pick bug fixes from v0.10.0 to main.
      
      **Major Revisions**
      
      * Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
      * Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
      * Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
      * Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
      * Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
      * CI/CD - Add ndv5 topo file #597
      * Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
      * Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
      * Dockerfile - Bug fix for rocm docker build and deploy #598
      * Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
      * Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
      * Monitor - U...
  5. 09 Dec, 2023 1 commit
  6. 07 Dec, 2023 1 commit
  7. 22 Nov, 2023 1 commit
  8. 22 Aug, 2023 1 commit
  9. 18 Aug, 2023 1 commit
  10. 27 Jul, 2023 1 commit
    • Release - SuperBench v0.9.0 (#558) · e1df877b
      Yuting Jiang authored
      **Description**
      Cherry-pick bug fixes from v0.9.0 to main.
      
      **Major Revision**
      - CI/CD: pipeline - clean more disk space to fix rocm building image pipeline (#555)
      - Benchmarks: bug fix - use absolute path for input file in DirectXEncodingLatency (#554)
      - CI/CD - add push win docker image on release branch in pipeline (#552)
      - Docs - Upgrade version and release note (#557)
  11. 05 Jul, 2023 3 commits
  12. 28 Jun, 2023 1 commit
  13. 14 Apr, 2023 1 commit
    • Release - SuperBench v0.8.0 (#517) · 51761b3a
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.8.0 to main.
      
      **Major Revisions**
      
      * Monitor - Fix the cgroup version checking logic (#502)
      * Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
      * Fix wrong torch usage in communication wrapper for Distributed Inference Benchmark (#505)
      * Analyzer: Fix bug in python3.8 due to pandas api change (#504)
      * Bug - Fix bug to get metric from cmd when error happens (#506)
      * Monitor - Collect realtime GPU power when benchmarking (#507)
      * Add num_workers argument in model benchmark (#511)
      * Remove unreachable condition when write host list (#512)
      * Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
      * Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
      * Docs - Upgrade version and release note (#508)
      Co-authored-by: guoshzhao <guzhao@microsoft.com>
      Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
      Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
  14. 23 Feb, 2023 1 commit
  15. 29 Dec, 2022 1 commit
  16. 18 Oct, 2022 1 commit
  17. 06 Jul, 2022 1 commit
    • Update dependencies and Dockerfile (#371) · 9f03d568
      Yifan Xiong authored
      Update dependencies and Dockerfile:
      * upgrade nccl-tests and rccl-tests to current latest version to match
        NCCL/RCCL versions
      * unify image tag names on DockerHub
      * remove verbose output in Dockerfile and fix some flags
  18. 19 Jun, 2022 1 commit
    • Update ROCm Dockerfile (#361) · 483bf782
      Yifan Xiong authored
      **Description**
      
      Update ROCm Dockerfile.
      
      **Major Revisions**
      - Add dockerfile for ROCm 5.1.3
      - Merge 5.1.x and 5.0.x dockerfile
      - Remove 4.2 and 4.0 legacy
      - Update build pipeline accordingly
  19. 25 May, 2022 1 commit
  20. 28 Feb, 2022 1 commit
  21. 25 Feb, 2022 1 commit
  22. 08 Feb, 2022 1 commit
  23. 11 Oct, 2021 1 commit
  24. 26 Sep, 2021 1 commit
    • Release - SuperBench v0.3.0 (#212) · dfbd70b1
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.3.0 to main.
      
      **Major Revisions**
      * Docs - Upgrade version and release note (#209)
      * Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
      * Benchmarks: Update - Update benchmarks in configuration file (#208)
      * CI/CD - Update GitHub Action VM (#211)
      * Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
      * CI/CD - Fix bug in build image for push event (#205)
      * Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
      * Tool: Fix bug - Fix function naming issue in system info (#200)
      * CI/CD - Push images in GitHub Action (#202)
      * Bug - Fix torch.distributed command for single node (#201)
      * CLI - Integrate system info for node (#199)
      * Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
      * CI/CD - Add ROCm image build in GitHub Actions (#194)
      * Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
      * Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
      * Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
      * Bug - Revise 'docker run' in sb deploy (#195)
      * Bug - Fix wrong parameter name (`operations` instead of `operation`) in rccl-bw of hpe config (#190)
      Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com>
      Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
      Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
  25. 09 Jul, 2021 1 commit
  26. 25 Jun, 2021 1 commit
  27. 16 Jun, 2021 1 commit
    • Dockerfile - Update CUDA 11.1.1 Dockerfile (#96) · 25ec3a7c
      Yifan Xiong authored
      Update packages and add build cache for CUDA 11.1.1 Dockerfile:
      
      * Remove duplicate cmake and ompi, which are already in base image
      * Add hpcx and sharp lib
      * Add cache for gitmodules build
      * Sort apt-get packages
  28. 01 Jun, 2021 1 commit
  29. 17 May, 2021 2 commits
  30. 14 Apr, 2021 1 commit
  31. 28 Jan, 2021 1 commit
    • Setup: Init - Initialize setup.py and basic configs (#4) · 5be32481
      Yifan Xiong authored
      Initialize setup.py and basic configurations for this project.
      
      Major revisions:
      
      - initialize setup.py for Python package
      - add gitignore and dockerignore
      - add editorconfig for editors
      - configure yapf for auto formatting
      - configure mypy for type hints
      - configure flake8 for linting, including quotes and docstrings
      - add pre-commit check for `git commit`
      - add spelling check in GitHub Actions
      - format existing files according to configured rules
      
      Example usage:
      
          # install dependencies
          $ python3 -m pip install -e .[dev,test]
          $ pre-commit install
      
          # format code automatically
          $ python3 setup.py format
      
          # lint code
          $ python3 setup.py lint
      
          # test code
          $ python3 setup.py test
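
      The commit message does not show how the `format` and `lint` commands are wired up. As a rough sketch only, assuming they simply dispatch to the tools named above (yapf, flake8, mypy), the mapping might look like the following. The names `TASKS`, `build_commands`, and `run`, and the exact tool flags, are hypothetical and not taken from the project's setup.py.

```python
import subprocess
import sys

# Hypothetical mapping from task name to the tool invocations behind it.
# The tool names come from the commit message; the flags are assumptions.
TASKS = {
    'format': [['yapf', '--in-place', '--recursive', '.']],
    'lint': [['flake8', '.'], ['mypy', '.']],
}


def build_commands(task):
    """Return one argv list per tool, each run as a module of this interpreter."""
    if task not in TASKS:
        raise ValueError(f'unknown task: {task}')
    return [[sys.executable, '-m', *args] for args in TASKS[task]]


def run(task):
    """Run every tool for the task, stopping at the first non-zero exit code."""
    for argv in build_commands(task):
        subprocess.check_call(argv)
```

      Running each tool through `sys.executable -m` keeps the invocation inside the environment where the dev dependencies were installed, which is one reasonable design choice among several.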