---
id: result-summary
---

# Result Summary

## Introduction

This tool generates a readable summary report from the raw benchmark results of a single machine or multiple machines.

## Usage

1. [Install SuperBench](../getting-started/installation) on the local machine.

2. Prepare the raw data and rule file on the local machine.

3. Generate the result summary automatically using the `sb result summary` command. The detailed command usage can be found in [SuperBench CLI](../cli).

   ```bash
   sb result summary --data-file ./results-summary.jsonl --rule-file ./rule.yaml --output-file-format md --output-dir ${output-dir}
   ```

4. Find the output file named `results_summary.md` under `${output-dir}`.

## Input

The input includes 2 files:

- **Raw Data**: JSONL file including multiple nodes' results, automatically generated by the SuperBench runner.

  :::tip Tips
  The raw data file can be found at `${output-dir}/results-summary.jsonl` after each successful run.
  :::

- **Rule File**: YAML file that defines how to generate the result summary, including how to classify the metrics and what statistical methods (P50, mean, etc.) are applied.

### Rule File

This section describes how to write rules in the **rule file**. The convention is the same as the [SuperBench Config File](../superbench-config); please read that first.

Here is an overview of the rule file structure:

```yaml title="Scheme"
version: string
superbench:
  rules:
    ${rule_name}:
      statistics:
        - ${statistic_name}
      categories: string
      aggregate: (optional)[bool|string]
      metrics:
        - ${benchmark_name}/regex
        - ${benchmark_name}/regex
```

```yaml title="Example"
# SuperBench rules
version: v0.4
superbench:
  rules:
    kernel_launch:
      statistics:
        - mean
        - p90
        - min
        - max
      aggregate: True
      categories: KernelLaunch
      metrics:
        - kernel-launch/event_overhead
        - kernel-launch/wall_overhead
    nccl:
      statistics: mean
      categories: NCCL
      metrics:
        - nccl-bw/allreduce_8388608_busbw
    ib-loopback:
      statistics: mean
      categories: RDMA
      metrics:
        - ib-loopback/IB_write_8388608_Avg_\d+
      aggregate: ib-loopback/IB_write_.*_Avg_(\d+)
```

This rule file describes the rules used for the result summary. The rules are organized by rule name, and each rule consists of the following elements:

#### `metrics`

The list of metrics for this rule. Each metric takes the format `${benchmark_name}/regex`: a regular expression is allowed after the first `/`, but note that the benchmark name itself cannot be a regex.

#### `categories`

A user-defined category name (string) for the rule, used to classify and organize the metrics.

#### `aggregate`

Determines whether to aggregate the benchmark results from multiple devices and treat them as one collection, e.g., aggregating the kernel-launch overhead results of 8 GPU devices into one collection. The value must be a bool or a pattern string with regex:

- bool:
  - `False` (default): no aggregation.
  - `True`: aggregate the results of multiple ranks. In detail, metric names in `metrics` like `metric:\d+` are aggregated and renamed to `metric`, which covers most micro-benchmark metrics.
- pattern string with regex: aggregate the results whose metric names match the pattern. In detail, the part of the metric name that matches the capture group `()` in the pattern string is replaced with `*`, while the rest of the name stays unchanged (see the sketch below).
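To make the pattern-string renaming concrete, here is a minimal Python sketch of the rewrite described above. It is an illustration only, not SuperBench's actual implementation, and `aggregate_metric_name` is a hypothetical helper name:

```python
import re

def aggregate_metric_name(metric, pattern):
    """Illustrative only: replace the part of the metric name matched by
    the capture group () in the pattern with '*', as described above."""
    match = re.match(pattern, metric)
    if not match or match.lastindex is None:
        return metric  # pattern does not apply; keep the name unchanged
    start, end = match.span(1)
    return metric[:start] + '*' + metric[end:]

# Metrics from two IB devices collapse into one collection:
pattern = r'ib-loopback/IB_write_.*_Avg_(\d+)'
print(aggregate_metric_name('ib-loopback/IB_write_8388608_Avg_0', pattern))
# -> ib-loopback/IB_write_8388608_Avg_*
print(aggregate_metric_name('ib-loopback/IB_write_8388608_Avg_1', pattern))
# -> ib-loopback/IB_write_8388608_Avg_*
```

Once the renamed metrics share one name, the values from all matching devices fall into a single collection, over which the statistical functions below are computed.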
#### `statistics`

The list of statistical functions used by this rule to compute statistics over the results from multiple nodes/ranks. All supported statistical functions are listed below; a short sketch after the list illustrates them:

- `count`
- `max`
- `mean`
- `min`
- `p${value}`: ${value} can be 1-99, e.g., p50, p90.
- `std`
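As an illustration of how these functions reduce per-node/rank values to summary statistics, here is a minimal pandas sketch. The metric name and values are made up for demonstration, and SuperBench's internal computation may differ in details (e.g., the exact std definition):

```python
import pandas as pd

# Made-up per-node values for one metric; real values come from the JSONL raw data.
values = pd.Series([2.1, 2.3, 1.9, 2.0, 2.6], name='kernel-launch/event_overhead')

summary = {
    'count': values.count(),
    'max': values.max(),
    'mean': values.mean(),
    'min': values.min(),
    'p50': values.quantile(0.50),  # p${value} corresponds to the value/100 quantile
    'p90': values.quantile(0.90),
    'std': values.std(),
}
print(summary)
```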