1. 22 Aug, 2021 1 commit
  2. 20 Aug, 2021 2 commits
    • Runner: Add Feature - Generate summarized output files. (#157) · 7595d794
      guoshzhao authored
      **Description**
      Generate summarized output files from all nodes. For each metric, apply the reduce operation specified by its `reduce_op`.
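
As a rough illustration of the reduce step described above, the sketch below collapses the per-rank values of one metric into a single number. The `REDUCE_FUNCS` table, the operation names (`min`, `max`, `sum`, `avg`), and the `reduce_metric` helper are assumptions made for this example, not the project's actual API.

```python
from statistics import mean

# Hypothetical mapping from a reduce_op name to a Python function.
# The operation names are assumed for illustration only.
REDUCE_FUNCS = {
    'min': min,
    'max': max,
    'sum': sum,
    'avg': mean,
}


def reduce_metric(values, reduce_op):
    """Reduce per-rank values with the given op; None leaves the values unreduced."""
    if reduce_op is None:
        return list(values)
    if reduce_op not in REDUCE_FUNCS:
        raise ValueError(f'unsupported reduce_op: {reduce_op}')
    return REDUCE_FUNCS[reduce_op](values)


# Per-rank values taken from the first four kernel-launch entries in the sample output.
per_rank = [0.00583, 0.00545, 0.00581, 0.00572]
print(reduce_metric(per_rank, 'max'))
print(reduce_metric(per_rank, 'min'))
```

Treating a missing `reduce_op` as "keep every rank" is one possible design choice; it would match the `:rank` suffix that appears on unreduced metrics in the sample output.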
      
      **Major Revision**
      - Generate the summarized JSON file per node. Metric keys use the following formats:
      For micro-benchmarks: `{benchmark_name}/[{run_count}/]{metric_name}[:rank]`
      For model benchmarks: `{benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}`
      Parts in `[]` are optional.
      ```
      {
        "kernel-launch/overhead_event:0": 0.00583,
        "kernel-launch/overhead_event:1": 0.00545,
        "kernel-launch/overhead_event:2": 0.00581,
        "kernel-launch/overhead_event:3": 0.00572,
        "kernel-launch/overhead_event:4": 0.00559,
        "kernel-launch/overhead_event:5": 0.00591,
        "kernel-launch/overhead_event:6": 0.00562,
        "kernel-launch/overhead_event:7": 0.00586,
        "resnet_models/pytorch-resnet50/steptime-train-float32": 544.0827468410134,
        "resnet_models/pytorch-resnet50/throughput-train-float32": 353.7607016465773,
        "resnet_models/pytorch-resnet50/steptime-train-float16": 425.40482617914677,
        "resnet_models/pytorch-resnet50/throughput-train-float16": 454.0142363793973,
        "pytorch-sharding-matmul/0/allreduce": 10.561786651611328,
        "pytorch-sharding-matmul/1/allreduce": 10.561786651611328,
        "pytorch-sharding-matmul/0/allgather": 10.088025093078613,
        "pytorch-sharding-matmul/1/allgather": 10.088025093078613
      }
      ```
      - Generate the summarized JSONL file for all nodes; each line is the result from one node in JSON format.
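
The two output files described above could be produced and consumed along the following lines. This is a minimal sketch: the helper names `write_jsonl`, `read_jsonl`, and `split_rank` are hypothetical, and the two-node data is illustrative, reusing values from the sample output.

```python
import io
import json


def write_jsonl(node_summaries, fp):
    """Write one JSON object per line, one line per node."""
    for summary in node_summaries:
        fp.write(json.dumps(summary) + '\n')


def read_jsonl(fp):
    """Read a JSONL stream back into a list of per-node dicts, skipping blank lines."""
    return [json.loads(line) for line in fp if line.strip()]


def split_rank(key):
    """Split an optional ':rank' suffix off a metric key; rank is None if absent."""
    name, sep, rank = key.rpartition(':')
    if sep and rank.isdigit():
        return name, int(rank)
    return key, None


# Illustrative summaries for two nodes (not real measurements from two machines).
nodes = [
    {'kernel-launch/overhead_event:0': 0.00583},
    {'kernel-launch/overhead_event:0': 0.00545},
]
buf = io.StringIO()
write_jsonl(nodes, buf)
buf.seek(0)
print(read_jsonl(buf) == nodes)
print(split_rank('kernel-launch/overhead_event:0'))
print(split_rank('pytorch-sharding-matmul/0/allreduce'))
```

Because each line is a complete JSON document, a consumer can process the file one node at a time without loading the whole file.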
    • Benchmarks: Build Pipeline - Add build logic of hipBusBandwidth in third_party (#151) · a1e5c90d
      Yuting Jiang authored
      **Description**
      Add build logic for hipBusBandwidth in third_party.
      
      **Major Revision**
      - Add build logic for hipBusBandwidth in third_party.
  3. 19 Aug, 2021 1 commit
  4. 16 Aug, 2021 2 commits
  5. 12 Aug, 2021 1 commit
  6. 09 Aug, 2021 1 commit
  7. 06 Aug, 2021 2 commits
  8. 05 Aug, 2021 1 commit
  9. 02 Aug, 2021 2 commits
  10. 30 Jul, 2021 1 commit
  11. 29 Jul, 2021 3 commits
  12. 27 Jul, 2021 2 commits
  13. 26 Jul, 2021 1 commit
  14. 23 Jul, 2021 2 commits
  15. 21 Jul, 2021 1 commit
  16. 20 Jul, 2021 2 commits
  17. 19 Jul, 2021 1 commit
  18. 16 Jul, 2021 2 commits
  19. 15 Jul, 2021 1 commit
  20. 13 Jul, 2021 2 commits
  21. 09 Jul, 2021 2 commits
  22. 08 Jul, 2021 1 commit
  23. 02 Jul, 2021 2 commits
    • Docs - Update README and version for v0.2.0 release (#111) · 43620c3f
      Yifan Xiong authored
      Update README and version for v0.2 release.
    • Runner - Fetch benchmarks results on all nodes (#116) · fb7d4a73
      Yifan Xiong authored
      Fetch benchmark results from all nodes; results are synced with rsync after each benchmark.
      The results directory structure on the control node is as follows:
      
      ```
      outputs/
      └── datetime
          ├── nodes
          │   └── node-0
          │       ├── benchmarks
          │       │   ├── benchmark-0
          │       │   │   ├── rank-0
          │       │   │   │   └── results.json
          │       └── sb-exec.log
          ├── sb-run.log
          └── sb.config.yaml
      ```
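
Given the layout above, the per-rank result files could be gathered on the control node roughly as follows. This is a sketch under assumptions: `collect_results` is a hypothetical helper, the glob pattern simply mirrors the tree shown, and the demo builds a throwaway directory with a made-up metric instead of reading real output.

```python
import json
import tempfile
from pathlib import Path


def collect_results(run_dir):
    """Map (node, benchmark, rank) directory names to parsed results.json contents."""
    results = {}
    pattern = 'nodes/*/benchmarks/*/*/results.json'
    for path in sorted(Path(run_dir).glob(pattern)):
        rank = path.parent.name          # e.g. rank-0
        benchmark = path.parents[1].name  # e.g. benchmark-0
        node = path.parents[3].name       # e.g. node-0
        with path.open() as f:
            results[(node, benchmark, rank)] = json.load(f)
    return results


# Demo on a temporary tree that mimics outputs/<datetime>/ for one node and one rank.
with tempfile.TemporaryDirectory() as tmp:
    rank_dir = Path(tmp) / 'nodes' / 'node-0' / 'benchmarks' / 'benchmark-0' / 'rank-0'
    rank_dir.mkdir(parents=True)
    (rank_dir / 'results.json').write_text(json.dumps({'example_metric': 1.0}))
    print(collect_results(tmp))
```

Keying the result by directory names keeps the node, benchmark, and rank available for any later per-node summarization.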
  24. 01 Jul, 2021 2 commits
  25. 30 Jun, 2021 2 commits