    Runner: Add Feature - Generate summarized output files. (#157) · 7595d794
    guoshzhao authored
    **Description**
    Generate summarized output files from the results of all nodes. For each metric, apply the reduce operation specified by its `reduce_op`.
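
A minimal sketch of the per-metric reduction, for illustration only. The operation names (`min`, `max`, `sum`, `avg`) and the `reduce_metric` helper are assumptions; the commit message does not list the supported `reduce_op` values.

```python
from statistics import mean

# Hypothetical mapping from a reduce_op name to a reduction function.
# The actual set of supported operations is not stated in this commit.
REDUCE_FUNCS = {"min": min, "max": max, "sum": sum, "avg": mean}


def reduce_metric(values, reduce_op):
    """Collapse the values collected for one metric into a single number."""
    if reduce_op not in REDUCE_FUNCS:
        raise ValueError(f"unsupported reduce_op: {reduce_op!r}")
    return REDUCE_FUNCS[reduce_op](values)
```

For example, `reduce_metric([10.0, 12.0], "max")` returns the larger of the two values.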
    
    **Major Revision**
    - Generate a summarized JSON file per node. Keys use the following formats, where `[]` marks an optional part:
    For micro-benchmarks: `{benchmark_name}/[{run_count}/]{metric_name}[:rank]`
    For model benchmarks: `{benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}`
    ```
    {
      "kernel-launch/overhead_event:0": 0.00583,
      "kernel-launch/overhead_event:1": 0.00545,
      "kernel-launch/overhead_event:2": 0.00581,
      "kernel-launch/overhead_event:3": 0.00572,
      "kernel-launch/overhead_event:4": 0.00559,
      "kernel-launch/overhead_event:5": 0.00591,
      "kernel-launch/overhead_event:6": 0.00562,
      "kernel-launch/overhead_event:7": 0.00586,
      "resnet_models/pytorch-resnet50/steptime-train-float32": 544.0827468410134,
      "resnet_models/pytorch-resnet50/throughput-train-float32": 353.7607016465773,
      "resnet_models/pytorch-resnet50/steptime-train-float16": 425.40482617914677,
      "resnet_models/pytorch-resnet50/throughput-train-float16": 454.0142363793973,
      "pytorch-sharding-matmul/0/allreduce": 10.561786651611328,
      "pytorch-sharding-matmul/1/allreduce": 10.561786651611328,
      "pytorch-sharding-matmul/0/allgather": 10.088025093078613,
      "pytorch-sharding-matmul/1/allgather": 10.088025093078613
    }
    ```
    - Generate a summarized JSONL file for all nodes; each line is the result from one node in JSON format.
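
The key formats and the JSONL layout described above can be sketched as follows. The helper names `summary_key` and `to_jsonl` are hypothetical and not taken from the commit; the sketch only assembles keys in the documented shape and serializes one JSON object per line.

```python
import json


def summary_key(benchmark, metric, sub_benchmark=None, run_count=None, rank=None):
    """Build a key such as `kernel-launch/overhead_event:0` (hypothetical helper)."""
    parts = [benchmark]
    if sub_benchmark is not None:   # present for model benchmarks
        parts.append(sub_benchmark)
    if run_count is not None:       # optional `{run_count}/` segment
        parts.append(str(run_count))
    parts.append(metric)
    key = "/".join(parts)
    if rank is not None:            # optional `:rank` suffix for micro-benchmarks
        key = f"{key}:{rank}"
    return key


def to_jsonl(node_summaries):
    """Serialize one summary dict per node, one JSON object per line."""
    return "".join(json.dumps(summary) + "\n" for summary in node_summaries)
```

Called as `summary_key("pytorch-sharding-matmul", "allreduce", run_count=0)`, the helper produces a key in the same shape as the entries in the sample output.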