1. 24 Sep, 2021 1 commit
  2. 23 Sep, 2021 1 commit
    • Benchmarks: Update - Update benchmarks in configuration file (#208) · a58f218b
      Yuting Jiang authored
      **Description**
      Update benchmarks in configuration files for single node validation of superbench v0.3.
      
      **Major Revision**
      - fix parameter bugs in nccl-bw for single node validation in configs
      - add new benchmarks to amd_mi100_hpe.yaml, amd_mi100_z53.yaml, azure_ndv4.yaml
      - fix the wrong GPU visible prefix
  3. 18 Sep, 2021 3 commits
  4. 17 Sep, 2021 1 commit
  5. 16 Sep, 2021 1 commit
  6. 14 Sep, 2021 1 commit
  7. 13 Sep, 2021 2 commits
  8. 09 Sep, 2021 1 commit
  9. 06 Sep, 2021 1 commit
  10. 03 Sep, 2021 1 commit
    • Benchmarks: Code Revision - Revise arguments of nccl/rccl to support mpi mode... · 60762518
      Yuting Jiang authored
      Benchmarks: Code Revision - Revise arguments of nccl/rccl to support mpi mode and rename metric (#189)
      
      **Description**
      Revise arguments of nccl/rccl to support mpi mode, and rename the metric. Previously mpi could not run in nccl/rccl because multiple operators ran in sequence without a barrier.
      
      **Major Revision**
      - revise the operators argument to take a single operator
      
      **Minor Revision**
      - rename the metric to remove the benchmark name
      - change the default value of the ngpus argument to 1
  11. 02 Sep, 2021 3 commits
  12. 01 Sep, 2021 2 commits
  13. 31 Aug, 2021 2 commits
  14. 30 Aug, 2021 4 commits
  15. 27 Aug, 2021 4 commits
  16. 26 Aug, 2021 1 commit
  17. 25 Aug, 2021 1 commit
  18. 22 Aug, 2021 1 commit
  19. 20 Aug, 2021 1 commit
    • Runner: Add Feature - Generate summarized output files. (#157) · 7595d794
      guoshzhao authored
      **Description**
      Generate the summarized output files from all nodes. For each metric, apply the reduce operation specified by its `reduce_op`.
      
      **Major Revision**
      - Generate the summarized JSON file per node:
      For micro-benchmarks, the key format is `{benchmark_name}/[{run_count}/]{metric_name}[:rank]`
      For model benchmarks, the key format is `{benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}`
      `[]` marks an optional part.
      ```
      {
        "kernel-launch/overhead_event:0": 0.00583,
        "kernel-launch/overhead_event:1": 0.00545,
        "kernel-launch/overhead_event:2": 0.00581,
        "kernel-launch/overhead_event:3": 0.00572,
        "kernel-launch/overhead_event:4": 0.00559,
        "kernel-launch/overhead_event:5": 0.00591,
        "kernel-launch/overhead_event:6": 0.00562,
        "kernel-launch/overhead_event:7": 0.00586,
        "resnet_models/pytorch-resnet50/steptime-train-float32": 544.0827468410134,
        "resnet_models/pytorch-resnet50/throughput-train-float32": 353.7607016465773,
        "resnet_models/pytorch-resnet50/steptime-train-float16": 425.40482617914677,
        "resnet_models/pytorch-resnet50/throughput-train-float16": 454.0142363793973,
        "pytorch-sharding-matmul/0/allreduce": 10.561786651611328,
        "pytorch-sharding-matmul/1/allreduce": 10.561786651611328,
        "pytorch-sharding-matmul/0/allgather": 10.088025093078613,
        "pytorch-sharding-matmul/1/allgather": 10.088025093078613
      }
      ```
      - Generate the summarized JSONL file for all nodes; each line is the result from one node in JSON format.
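
The key formats described in this commit can be taken apart with a few lines of Python. The following is only a minimal sketch for post-processing a per-node summary, not SuperBench's own implementation: `parse_metric_key`, `reduce_over_ranks`, and the `REDUCE_OPS` table are hypothetical names, and the set of values accepted by `reduce_op` is an assumption, since the commit message does not list them.

```python
from statistics import mean

# Assumed set of reduce operations; the values SuperBench accepts for
# `reduce_op` are not listed in the commit message.
REDUCE_OPS = {'min': min, 'max': max, 'mean': mean}


def parse_metric_key(key):
    """Split a summarized key into benchmark, middle parts, metric, and rank.

    Assumes the key contains at least one '/', as in both documented formats.
    """
    path, sep, rank = key.rpartition(':')
    if sep and rank.isdigit():
        rank = int(rank)
    else:
        # No ':rank' suffix, so the whole key is the path.
        path, rank = key, None
    parts = path.split('/')
    return {
        'benchmark': parts[0],
        'middle': parts[1:-1],  # sub-benchmark name and/or run count, if any
        'metric': parts[-1],
        'rank': rank,
    }


def reduce_over_ranks(summary, reduce_op='mean'):
    """Collapse per-rank entries of one node's summary into one value per metric."""
    grouped = {}
    for key, value in summary.items():
        parsed = parse_metric_key(key)
        name = '/'.join([parsed['benchmark'], *parsed['middle'], parsed['metric']])
        grouped.setdefault(name, []).append(value)
    return {name: REDUCE_OPS[reduce_op](values) for name, values in grouped.items()}


# Sample values copied from the JSON example in the commit message.
node_summary = {
    'kernel-launch/overhead_event:0': 0.00583,
    'kernel-launch/overhead_event:1': 0.00545,
    'pytorch-sharding-matmul/0/allreduce': 10.561786651611328,
}
print(parse_metric_key('kernel-launch/overhead_event:1'))
print(reduce_over_ranks(node_summary, 'max'))
```

The same helpers could be applied line by line to the JSONL file, calling `json.loads` on each line to obtain one node's summary.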
  20. 19 Aug, 2021 1 commit
  21. 16 Aug, 2021 1 commit
  22. 06 Aug, 2021 2 commits
  23. 05 Aug, 2021 1 commit
  24. 30 Jul, 2021 1 commit
  25. 29 Jul, 2021 1 commit
    • Release - SuperBench v0.2.1 (#142) · 69b2c631
      Yifan Xiong authored
      __Description__
      Cherry-pick bug fixes from v0.2.1 to main.
      
      __Major Revisions__
      * Fix VGG models failing on A100 GPU with batch_size=128.
      * Fix Ansible connection issue when running in localhost.
      * Update version in packages and docs.
  26. 27 Jul, 2021 1 commit