1. 08 Jan, 2024 1 commit
    • Release - SuperBench v0.10.0 (#607) · 2c88db90
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.10.0 to main.
      
      **Major Revisions**
      
      * Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
      * Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
      * Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
      * Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
      * Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
      * CI/CD - Add ndv5 topo file #597
      * Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
      * Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
      * Dockerfile - Bug fix for rocm docker build and deploy #598
      * Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
      * Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
      * Monitor - Upgrade pyrsmi to amdsmi python library. #601
      * Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
      * Dockerfile - Add rocm6.0 dockerfile #602
      * Bug Fix - Bug fix for latest megatron-lm benchmark #600
      * Docs - Upgrade version and release note #606
      Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
      Co-authored-by: Yang Wang <yangwang1@microsoft.com>
      Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
      Co-authored-by: guoshzhao <guzhao@microsoft.com>
  2. 10 Dec, 2023 1 commit
  3. 09 Dec, 2023 1 commit
  4. 08 Dec, 2023 1 commit
  5. 07 Dec, 2023 1 commit
  6. 04 Dec, 2023 1 commit
  7. 22 Nov, 2023 1 commit
  8. 27 Jul, 2023 1 commit
    • Release - SuperBench v0.9.0 (#558) · e1df877b
      Yuting Jiang authored
      **Description**
      Cherry-pick bug fixes from v0.9.0 to main.
      
      **Major Revision**
      - CI/CD: pipeline - clean more disk space to fix rocm building image
      pipeline (#555)
      - Benchmarks: bug fix - use absolute path for input file in
      DirectXEncodingLatency (#554)
      - CI/CD - add push win docker image on release branch in pipeline (#552)
      - Docs - Upgrade version and release note (#557)
  9. 30 Jun, 2023 1 commit
  10. 29 Jun, 2023 1 commit
    • Tools - Add runner for sys info and update docs (#532) · ed027e4c
      Yuting Jiang authored
      **Description**
      Add a runner for sys info to automatically collect system information
      on multiple nodes, and update related docs.
      
      **Major Revision**
      - add a runner for sys info that checks Docker status, runs `sb node info`
      in the Docker container on every node, and fetches the results from all nodes
      
      **Minor Revision**
      - update cli and system-info doc
      - update `sb node info` to save its output into output-dir/sys-info.json
  11. 04 May, 2023 1 commit
  12. 28 Apr, 2023 1 commit
  13. 14 Apr, 2023 1 commit
    • Release - SuperBench v0.8.0 (#517) · 51761b3a
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.8.0 to main.
      
      **Major Revisions**
      
      * Monitor - Fix the cgroup version checking logic (#502)
      * Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
      * Fix wrong torch usage in communication wrapper for Distributed
      Inference Benchmark (#505)
      * Analyzer: Fix bug in python3.8 due to pandas api change (#504)
      * Bug - Fix bug to get metric from cmd when error happens (#506)
      * Monitor - Collect realtime GPU power when benchmarking (#507)
      * Add num_workers argument in model benchmark (#511)
      * Remove unreachable condition when write host list (#512)
      * Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
      * Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
      * Docs - Upgrade version and release note (#508)
      Co-authored-by: guoshzhao <guzhao@microsoft.com>
      Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
      Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
  14. 24 Mar, 2023 1 commit
  15. 22 Mar, 2023 1 commit
  16. 21 Mar, 2023 1 commit
  17. 13 Feb, 2023 1 commit
  18. 28 Jan, 2023 1 commit
  19. 04 Jan, 2023 2 commits
  20. 03 Jan, 2023 2 commits
  21. 30 Dec, 2022 1 commit
    • Executor - Add stdout logging util module and enable real-time logging flushing in executor (#445) · 9dfefce3
      Yuting Jiang authored
      **Description**
      Add stdout logging util module and enable real-time logging flushing in executor
      
      **Major Revision**
      - Add stdout logging util module to redirect stdout into file log
      - enable stdout logging in executor to write benchmark output into both stdout and file `sb-bench.log`
      - enable real-time log flushing in run_command of microbenchmarks through config `log_flushing`
      
      **Minor Revision**
      - add log_n_step args to enable regular step time log in model benchmarks 
      - update related docs
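
The stdout redirection described in this commit can be illustrated with a small tee-style wrapper. This is a minimal sketch, not the module added by the commit: the names `TeeStream` and `tee_stdout` are hypothetical, and it assumes that writing each chunk to both the console and the log file, flushing after every write, is an acceptable approximation of "real-time" logging.

```python
import sys
from contextlib import contextmanager


class TeeStream:
    """Minimal file-like object that forwards writes to several targets.

    Hypothetical helper for illustration; not SuperBench's actual class.
    """

    def __init__(self, *targets):
        self._targets = targets

    def write(self, text):
        for target in self._targets:
            target.write(text)
            target.flush()  # flush immediately so output appears in real time
        return len(text)

    def flush(self):
        for target in self._targets:
            target.flush()


@contextmanager
def tee_stdout(log_path):
    """Temporarily duplicate everything printed to stdout into log_path."""
    original = sys.stdout
    with open(log_path, "a", encoding="utf-8") as log_file:
        sys.stdout = TeeStream(original, log_file)
        try:
            yield
        finally:
            # Always restore the original stream, even if the body raises.
            sys.stdout = original
```

Wrapping a benchmark run in `with tee_stdout("sb-bench.log"):` would then send every `print` to both the terminal and the file; the file name here simply mirrors the one mentioned in the commit message.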
  22. 29 Dec, 2022 1 commit
  23. 29 Nov, 2022 1 commit
    • Runner - support 'pattern' in 'mpi' mode to run tasks in parallel (#430) · e4eeda0a
      Yang Wang authored
      * add mpi-parallels mode
      
      * update according to comments
      
      * fix and update doc
      
      * update
      
      * merge into 'mpi' mode
      
      * update according to comments
      
      * fix testcases
      
      * fix ansible
      
      * regard pattern as field
      
      * update
      
      * fix flake8 version
      
      * add flake8 range
      
      * remove map-by from host config
      
      * update comments
  24. 06 Sep, 2022 1 commit
    • Release - SuperBench v0.6.0 (#409) · 63e9b2d1
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.6.0 to main.
      
      **Major Revisions**
      
      * Enable latency test in ib traffic validation distributed benchmark (#396)
      * Enhance parameter parsing to allow spaces in value (#397)
      * Update apt packages in dockerfile (#398)
      * Upgrade colorlog for NO_COLOR support (#404)
      * Analyzer - Update error handling to support exit code of sb result diagnosis (#403)
      * Analyzer - Make baseline file optional in data diagnosis and fix bugs (#399)
      * Enhance timeout cleanup to avoid possible hanging (#405)
      * Auto generate ibstat file by pssh (#402)
      * Analyzer - Format int type and unify empty value to N/A in diagnosis output file (#406)
      * Docs - Upgrade version and release note (#407)
      * Docs - Fix issues in document (#408)
      Co-authored-by: Yang Wang <yangwang1@microsoft.com>
      Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
  25. 22 Aug, 2022 1 commit
  26. 17 Aug, 2022 1 commit
    • Update Python setup for require packages (#387) · 626ac0a4
      Yifan Xiong authored
      __Description__
      
      Update Python setup for required packages.
      
      __Major Revisions__
      * downgrade requests version to be compatible with python 3.6, add corresponding pipeline for 3.6
      * add extra entry in extras_require for nested packages
      * update `pip install` contents accordingly
  27. 26 Jul, 2022 1 commit
    • Support topo-aware IB performance validation (#373) · ef4d6574
      Jie Zhang authored
      
      
      * Support topo-aware IB performance validation
      
      Add a new pattern `topo-aware` so that the user can run IB performance
      tests based on the VMs' topology information. This way, the user can
      validate IB performance across VM pairs at different distances
      as a quick test instead of a pair-wise test.
      
      To run with the topo-aware pattern, the user needs to specify three
      required (and two optional) parameters in the YAML config file:
      --pattern        topo-aware
      --ibstat         path to ibstat output
      --ibnetdiscover  path to ibnetdiscover output
      --min_dist       minimum distance of VM pairs (optional, default 2)
      --max_dist       maximum distance of VM pairs (optional, default 6)
      
      The newly added topo_aware module then parses the topology
      information, builds a graph, and generates the VM pairs with
      the specified distance (# hops).
      
      The specified IB test then runs across these
      generated VM pairs.
      Signed-off-by: Jie Zhang <jessezhang1010@gmail.com>
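
The pair generation described above (build a graph, then pick VM pairs by hop distance) can be sketched with a breadth-first search. This is only an illustration and not the code of the topo_aware module: the function names, the adjacency-dict graph format, and the node names are hypothetical, and it assumes that distance means the number of edges on the shortest path between two VMs.

```python
from collections import deque


def hop_distances(graph, source):
    """Return shortest hop counts from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def pairs_by_distance(graph, hosts, min_dist=2, max_dist=6):
    """Group unordered host pairs by hop distance within [min_dist, max_dist]."""
    hosts = sorted(hosts)
    result = {}
    for i, src in enumerate(hosts):
        dist = hop_distances(graph, src)
        for dst in hosts[i + 1:]:
            d = dist.get(dst)  # None if dst is unreachable from src
            if d is not None and min_dist <= d <= max_dist:
                result.setdefault(d, []).append((src, dst))
    return result


# Hypothetical topology: two leaf switches under one spine switch.
example_graph = {
    "vm1": ["sw1"], "vm2": ["sw1"], "vm3": ["sw2"],
    "sw1": ["vm1", "vm2", "spine"],
    "sw2": ["vm3", "spine"],
    "spine": ["sw1", "sw2"],
}

print(pairs_by_distance(example_graph, ["vm1", "vm2", "vm3"]))
# {2: [('vm1', 'vm2')], 4: [('vm1', 'vm3'), ('vm2', 'vm3')]}
```

In this toy topology, VMs under the same leaf switch are 2 hops apart and VMs under different leaf switches are 4 hops apart, so testing one pair per distance covers each path class without running every pair.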
      
      * Add description about topology aware ib traffic tests
      Signed-off-by: Jie Zhang <jessezhang1010@gmail.com>
      
      * Add unit test to verify generated topology aware config file
      
      This commit adds a unit test to verify that the generated topology aware
      config file is correct. To do so, four new data files are added in
      order to invoke the gen_topo_aware_config function to generate a topology
      aware config file, which is then compared with the expected config file.
      Signed-off-by: Jie Zhang <jessezhang1010@gmail.com>
      
      * Fix lint issue on Azure pipeline
      Signed-off-by: Jie Zhang <jessezhang1010@gmail.com>
  28. 08 Jul, 2022 1 commit
    • Support node_num=1 in mpi mode (#372) · e00a8180
      Yifan Xiong authored
      Support `node_num: 1` in mpi mode, so that mpi benchmarks can run on
      either one node or all nodes from the same config by changing `node_num`.
      Update docs and add test case accordingly.
  29. 29 Jun, 2022 1 commit
    • Fix issues in ib loopback benchmark (#369) · 620192a2
      Yifan Xiong authored
      Fix several issues in ib loopback benchmark:
      * use `--report_gbits` and divide by 8 to get GB/s; previous results were
        MiB/s / 1000
      * use the ib_write_bw binary built in third_party instead of system path
      * update the metric names so that different HCA indices share the same metric
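
The unit fix in the first bullet can be checked with a little arithmetic. The sketch below is an illustration only: the function names are hypothetical, and the "legacy" formula is taken directly from the commit message's description (MiB/s divided by 1000), not from the benchmark source.

```python
def gbits_to_gbytes(gbits_per_sec):
    """Convert Gbit/s (as reported with --report_gbits) to GB/s: 8 bits per byte."""
    return gbits_per_sec / 8


def legacy_metric(bytes_per_sec):
    """Previous behaviour per the commit message: MiB/s divided by 1000."""
    mib_per_sec = bytes_per_sec / 2**20
    return mib_per_sec / 1000


# Example: a link moving 200 Gbit/s, i.e. 25e9 bytes per second.
print(gbits_to_gbytes(200))              # 25.0
print(round(legacy_metric(25e9), 2))     # 23.84
```

Under these assumptions the old value under-reports the same throughput by roughly 4.6%, because it mixes a binary prefix (2^20 bytes per MiB) with a decimal divisor (1000).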
  30. 24 Jun, 2022 1 commit
    • Support multiple IB/GPU in ib validation (#363) · bfaa1c83
      Yifan Xiong authored
      **Description**
      
      Support running multiple IB/GPU devices simultaneously in the ib validation benchmark.
      
      **Major Revisions**
      - Revise ib_validation_performance.cc so that multiple processes per node could be used to launch multiple perftest commands simultaneously. For each node pair in the config, number of processes per node will run in parallel.
      - Revise ib_validation_performance.py to correct file paths and adjust parameters to specify different NICs/GPUs/NUMA nodes.
      - Fix env issues in Dockerfile for end-to-end test.
      - Update ib-traffic configuration examples in config files.
      - Update unit tests and docs accordingly.
      
      Closes #326.
  31. 14 Jun, 2022 1 commit
    • Support `sb run` on host directly without Docker (#358) · a4937e95
      Yifan Xiong authored
      **Description**
      
      Support `sb run` on host directly without Docker
      
      **Major Revisions**
      - Add `--no-docker` argument for `sb run`.
      - Run on host directly if `--no-docker` is specified.
      - Update docs and tests correspondingly.
  32. 29 Apr, 2022 1 commit
    • Release - SuperBench v0.5.0 (#350) · 6681c720
      Yifan Xiong authored
      
      
      **Description**
      
      Cherry-pick bug fixes from v0.5.0 to main.
      
      **Major Revisions**
      
      * Bug - Force to fix ort version as '1.10.0' (#343)
      * Bug - Support no matching rules and unify the output name in result_summary (#345)
      * Analyzer - Support regex in annotations of benchmark naming for metrics in rules (#344)
      * Bug - Fix bugs in sync results on root rank for e2e model benchmarks (#342)
      * Bug - Fix bug of duration feature for model benchmarks in distributed mode (#347)
      * Docs - Upgrade version and release note (#348)
      Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>
  33. 20 Apr, 2022 1 commit
  34. 15 Apr, 2022 1 commit
  35. 11 Apr, 2022 1 commit
  36. 08 Apr, 2022 2 commits
  37. 01 Apr, 2022 1 commit