This tool generates a readable summary report from the raw benchmark results of one or more machines.
## Usage
1. [Install SuperBench](../getting-started/installation) on the local machine.
2. Prepare the raw data and rule file on the local machine.
3. Generate the result summary automatically using the `sb result summary` command. Details of the command can be found in [SuperBench CLI](../cli).
```bash
sb result summary --data-file ./results-summary.jsonl --rule-file ./rule.yaml --output-file-format md --output-dir ${output-dir}
```
4. Find the output result file named 'results_summary.md' under ${output-dir}.
## Input
The input consists of two files:
- **Raw Data**: a JSONL file containing multiple nodes' results, generated automatically by the SuperBench runner.
:::tip Tips
The raw data file can be found at ${output-dir}/results-summary.jsonl after each successful run.
:::
- **Rule File**: a YAML file that defines how to generate the result summary, including how to classify the metrics and which statistical methods (P50, mean, etc.) to apply.
### Rule File
This section describes how to write rules in **rule file**.
The conventions are the same as in the [SuperBench Config File](../superbench-config); please read that first.
Here is an overview of the rule file structure:
```yaml title="Scheme"
version: string
superbench:
  rules:
    ${rule_name}:
      statistics:
        - ${statistic_name}
      categories: string
      aggregate: (optional)[bool|string]
      metrics:
        - ${benchmark_name}/regex
        - ${benchmark_name}/regex
```
```yaml title="Example"
# SuperBench rules
version: v0.4
superbench:
  rules:
    kernel_launch:
      statistics:
        - mean
        - p90
        - min
        - max
      aggregate: True
      categories: KernelLaunch
      metrics:
        - kernel-launch/event_overhead
        - kernel-launch/wall_overhead
    nccl:
      statistics: mean
      categories: NCCL
      metrics:
        - nccl-bw/allreduce_8388608_busbw
    ib-loopback:
      statistics: mean
      categories: RDMA
      metrics:
        - ib-loopback/IB_write_8388608_Avg_\d+
      aggregate: ib-loopback/IB_write_.*_Avg_(\d+)
```
This rule file defines the rules used to generate the result summary.
Rules are organized by rule name, and each rule consists of the following elements:
#### `metrics`
The list of metrics for this rule. Each metric is in the format ${benchmark_name}/regex; a regular expression can be used after the first '/', but note that the benchmark name itself cannot be a regex.
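The split at the first '/' can be pictured with a short Python sketch. This is an illustrative re-implementation, not SuperBench's actual code, and the function name `metric_matches` is hypothetical:

```python
import re

def metric_matches(pattern: str, metric: str) -> bool:
    """Check whether a full metric name matches a '${benchmark_name}/regex' pattern.

    The benchmark name (before the first '/') must match literally;
    the remainder of the pattern is treated as a regular expression.
    (Illustrative sketch, not SuperBench's actual implementation.)
    """
    bench, _, regex = pattern.partition("/")
    m_bench, _, m_rest = metric.partition("/")
    return m_bench == bench and re.fullmatch(regex, m_rest) is not None

# The rule pattern from the example above matches any per-device suffix:
print(metric_matches(r"ib-loopback/IB_write_8388608_Avg_\d+",
                     "ib-loopback/IB_write_8388608_Avg_0"))   # True
# A literal pattern only matches the exact metric name:
print(metric_matches(r"kernel-launch/event_overhead",
                     "kernel-launch/event_overhead"))          # True
```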
#### `categories`
A user-defined category name (a string) that the rule belongs to, used to classify and organize the metrics.
#### `aggregate`
Determines whether to aggregate the benchmark results from multiple devices and treat them as one collection.
For example, the kernel-launch overhead results from 8 GPU devices can be aggregated into one collection.
The value of this item should be a bool or a pattern string with regex:
- bool:
  - `False` (default): no aggregation.
  - `True`: aggregate the results of multiple ranks. In detail, metric names in `metrics` like 'metric:\d+' will be aggregated and turned into 'metric' for most microbenchmark metrics.
- pattern string with regex: aggregate the results using the pattern string, which is matched against the metric names in `metrics`. In detail, the part of the metric that matches the contents of `()` in the pattern string will be turned into `*`; the other parts of the metric remain unchanged.
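The renaming described above can be sketched in Python. This is an assumed illustration of the logic, not SuperBench's real implementation, and the function name `aggregate_name` is hypothetical:

```python
import re

def aggregate_name(metric: str, aggregate) -> str:
    """Derive the aggregated metric name (illustrative sketch only).

    - aggregate is True: strip a trailing rank suffix like ':3'.
    - aggregate is a pattern string: replace the part captured by '()' with '*'.
    """
    if aggregate is True:
        return re.sub(r":\d+$", "", metric)
    m = re.match(aggregate, metric)
    if m and m.lastindex:
        start, end = m.span(1)  # span of the first capture group
        return metric[:start] + "*" + metric[end:]
    return metric

print(aggregate_name("kernel-launch/event_overhead:3", True))
# kernel-launch/event_overhead
print(aggregate_name("ib-loopback/IB_write_8388608_Avg_5",
                     r"ib-loopback/IB_write_.*_Avg_(\d+)"))
# ib-loopback/IB_write_8388608_Avg_*
```

Metrics that map to the same aggregated name are then collected together before the statistics are computed.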
#### `statistics`
The list of statistical functions applied by this rule to compute statistics over the results from multiple nodes/ranks.
The supported statistical functions are:
- `count`
- `max`
- `mean`
- `min`
- `p${value}`: ${value} can be 1-99. For example, p50, p90, etc.
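A minimal sketch of how these functions could be applied to one metric's values gathered from all nodes/ranks. The `summarize` helper and the nearest-rank percentile are assumptions for illustration; SuperBench's actual statistics may be computed differently:

```python
import statistics

def percentile(values, p):
    """p-th percentile via the nearest-rank method (illustrative;
    the real implementation may interpolate differently)."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

def summarize(values, funcs):
    """Apply a rule's statistical functions to values from multiple nodes/ranks."""
    out = {}
    for f in funcs:
        if f == "count":
            out[f] = len(values)
        elif f == "max":
            out[f] = max(values)
        elif f == "min":
            out[f] = min(values)
        elif f == "mean":
            out[f] = statistics.mean(values)
        elif f.startswith("p"):
            out[f] = percentile(values, int(f[1:]))
    return out

# e.g. kernel-launch overheads (in microseconds) collected from 4 ranks:
print(summarize([4.2, 4.0, 4.8, 4.1], ["mean", "p50", "min", "max", "count"]))
```

Each row of the generated summary table then holds these statistics for one metric under its category.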