1. 28 Jan, 2023 1 commit
  2. 01 Nov, 2022 1 commit
    • CLI - Add non-zero return code for `sb [deploy,run]` (#425) · 1b86503d
      Yifan Xiong authored
      Add a non-zero return code for the `sb deploy` and `sb run` commands when
      there are Ansible failures in the control plane.
      The return code is set to the number of failures.
      
      For failures caused by benchmarks, the return code is still set per benchmark
      in the results JSON file.
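
The return-code convention described in this commit could be consumed by a wrapper script roughly as follows. This is a minimal sketch, not SuperBench code: `interpret_return_code` and `run_and_report` are hypothetical helpers, and the only assumption taken from the message is that the exit status is zero on success and otherwise equals the number of Ansible failures.

```python
import subprocess
import sys


def interpret_return_code(returncode: int) -> str:
    """Summarize an `sb deploy` / `sb run` exit status (hypothetical helper)."""
    if returncode == 0:
        return 'no Ansible failures reported'
    # Assumption from the commit message: the code is the failure count.
    return f'{returncode} Ansible failure(s) reported in control plane'


def run_and_report(command: list) -> int:
    """Run a command, print a summary of its exit status, and return it."""
    completed = subprocess.run(command)
    print(interpret_return_code(completed.returncode))
    return completed.returncode


if __name__ == '__main__':
    # Stand-in command that exits with status 2; replace it with the real
    # `sb deploy` or `sb run` invocation and its arguments.
    run_and_report([sys.executable, '-c', 'import sys; sys.exit(2)'])
```

Benchmark-level failures are not visible through this exit status; per the message, those are still recorded per benchmark in the results JSON file.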
  3. 31 Oct, 2022 1 commit
  4. 06 Sep, 2022 1 commit
    • Release - SuperBench v0.6.0 (#409) · 63e9b2d1
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.6.0 to main.
      
      **Major Revisions**
      
      * Enable latency test in ib traffic validation distributed benchmark (#396)
      * Enhance parameter parsing to allow spaces in value (#397)
      * Update apt packages in dockerfile (#398)
      * Upgrade colorlog for NO_COLOR support (#404)
      * Analyzer - Update error handling to support exit code of sb result diagnosis (#403)
      * Analyzer - Make baseline file optional in data diagnosis and fix bugs (#399)
      * Enhance timeout cleanup to avoid possible hanging (#405)
      * Auto generate ibstat file by pssh (#402)
      * Analyzer - Format int type and unify empty value to N/A in diagnosis output file (#406)
      * Docs - Upgrade version and release note (#407)
      * Docs - Fix issues in document (#408)
      Co-authored-by: Yang Wang <yangwang1@microsoft.com>
      Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
  5. 22 Aug, 2022 1 commit
  6. 14 Jun, 2022 1 commit
    • Support `sb run` on host directly without Docker (#358) · a4937e95
      Yifan Xiong authored
      **Description**
      
      Support `sb run` on host directly without Docker
      
      **Major Revisions**
      - Add `--no-docker` argument for `sb run`.
      - Run on host directly if `--no-docker` is specified.
      - Update docs and tests correspondingly.
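
As an illustration of how a boolean switch such as `--no-docker` can select the execution target, here is a small standalone sketch. It uses plain `argparse` and a hypothetical `describe_target` function; it does not reproduce how the `sb` CLI actually registers or handles the argument.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser with a `--no-docker` switch (illustrative only)."""
    parser = argparse.ArgumentParser(prog='sb-run-sketch')
    parser.add_argument(
        '--no-docker',
        action='store_true',
        help='Run on host directly instead of inside a Docker container.',
    )
    return parser


def describe_target(argv: list) -> str:
    """Return 'host' when `--no-docker` is given, otherwise 'docker'."""
    args = build_parser().parse_args(argv)
    return 'host' if args.no_docker else 'docker'


if __name__ == '__main__':
    print(describe_target(['--no-docker']))
    print(describe_target([]))
```

Using `store_true` keeps the container path as the default, so existing invocations behave the same unless the flag is passed.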
  7. 11 Apr, 2022 1 commit
  8. 08 Apr, 2022 1 commit
  9. 30 Jan, 2022 1 commit
  10. 18 Jan, 2022 1 commit
    • CLI - Add command sb benchmark [list,list-parameters] (#279) · f7ffc545
      Yifan Xiong authored
      __Description__
      
      Add command `sb benchmark list` and `sb benchmark list-parameters` to support listing all optional parameters for benchmarks.
      
      <details>
      <summary>Examples</summary>
      <pre>
      $ sb benchmark list -n [a-z]+-bw -o table
      Result
      --------
      mem-bw
      nccl-bw
      rccl-bw
      </pre>
      <pre>
      $ sb benchmark list-parameters -n mem-bw
      === mem-bw ===
      optional arguments:
        --bin_dir str         Specify the directory of the benchmark binary.
        --duration int        The elapsed time of benchmark in seconds.
        --mem_type str [str ...]
                              Memory types to benchmark. E.g. htod dtoh dtod.
        --memory str          Memory argument for bandwidthtest. E.g. pinned unpinned.
        --run_count int       The run count of benchmark.
        --shmoo_mode          Enable shmoo mode for bandwidthtest.
      default values:
      {'bin_dir': None,
       'duration': 0,
       'mem_type': ['htod', 'dtoh'],
       'memory': 'pinned',
       'run_count': 1}
      </pre>
      </details>
      
      __Major Revisions__
      * Add `sb benchmark list` to list benchmarks matching given name.
      * Add `sb benchmark list-parameters` to list parameters for benchmarks which match given name.
      
      __Minor Revisions__
      * Sort format help text for argparse.
  11. 10 Dec, 2021 1 commit
  12. 26 Sep, 2021 1 commit
    • Release - SuperBench v0.3.0 (#212) · dfbd70b1
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.3.0 to main.
      
      **Major Revisions**
      * Docs - Upgrade version and release note (#209)
      * Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210)
      * Benchmarks: Update - Update benchmarks in configuration file (#208)
      * CI/CD - Update GitHub Action VM (#211)
      * Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203)
      * CI/CD - Fix bug in build image for push event (#205)
      * Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204)
      * Tool: Fix bug - Fix function naming issue in system info  (#200)
      * CI/CD - Push images in GitHub Action (#202)
      * Bug - Fix torch.distributed command for single node (#201)
      * CLI - Integrate system info for node (#199)
      * Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196)
      * CI/CD - Add ROCm image build in GitHub Actions (#194)
      * Bug: Fix bug - fix bug of hipBusBandwidth build (#193)
      * Benchmarks: Build Pipeline - Restore rocblas build logic (#197)
      * Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198)
      * Bug - Revise 'docker run' in sb deploy (#195)
      * Bug - Fix Bug : fix bug of error param operations to operation in rccl-bw of hpe config (#190)
      Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com>
      Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com>
      Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
  13. 01 Jul, 2021 2 commits
  14. 23 Jun, 2021 1 commit
    • Bug bash - Fix bugs in multi GPU benchmarks (#98) · c0c43b8f
      Yifan Xiong authored
      * Add `sb deploy` command content.
      * Fix inline if-expression syntax in playbook.
      * Fix quote escape issue in bash command.
      * Add custom env in config.
      * Update default config for multi GPU benchmarks.
      * Update MANIFEST.in to include jinja2 template.
      * Require jinja2 minimum version.
      * Fix occasional duplicate output in Ansible runner.
      * Fix mixed color from Ansible and Python colorlog.
      * Update according to comments.
      * Change superbench.env from list to dict in config file.
  15. 18 May, 2021 1 commit
  16. 12 Apr, 2021 1 commit
  17. 26 Mar, 2021 1 commit
  18. 12 Mar, 2021 1 commit