---
id: baseline-generation
---

# Baseline Generation

## Introduction

This tool generates a baseline JSON file from the raw benchmark results of multiple machines.

## Usage

1. [Install SuperBench](../getting-started/installation.mdx) on the local machine.

2. Prepare the raw data and rule files on the local machine.

3. Generate the baseline file automatically using the `sb result generate-baseline` command. See [SuperBench CLI](../cli.md) for the full command reference.

  ```bash
  sb result generate-baseline --data-file ./results-summary.jsonl --summary-rule-file ./summary-rule.yaml --diagnosis-rule-file ./diagnosis-rule.yaml --output-dir ${output_dir}
  ```

4. Find the output file named `baseline.json` under `${output_dir}`.
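Before running step 3, it can help to confirm that the input files are actually present and non-empty. The sketch below is not part of SuperBench: `check_inputs` is a hypothetical helper name, and it relies only on standard shell tests.

```bash
# Hypothetical pre-flight helper (not a SuperBench command).
# Returns 0 only if every given path is an existing, non-empty file;
# each failing path is reported on stderr.
check_inputs() {
  status=0
  for path in "$@"; do
    if [ ! -s "$path" ]; then
      echo "missing or empty input: $path" >&2
      status=1
    fi
  done
  return "$status"
}
```

One possible use is to guard the command from step 3, for example by prefixing it with `check_inputs ./results-summary.jsonl ./summary-rule.yaml &&` so that it runs only when both required files are in place.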

## Input

The input consists of up to 4 files, 2 of which are optional:

- **Raw Data**: A JSONL file containing the results of multiple nodes, generated automatically by the SuperBench runner.

:::tip Tips
The raw data file can be found at `${output_dir}/results-summary.jsonl` after each successful run.
:::

- **Summary Rule File**: A YAML file that defines how to generate the result summary, including how to classify the metrics and which statistical methods (P50, mean, etc.) to apply.

- **Diagnosis Rule File (optional)**: A YAML file containing the per-metric rules used to filter out defective machines for diagnosis. If it is not specified, no machines are filtered.

- **Previous Baseline File (optional)**: A baseline file in JSON format produced by a previous run, to be merged into the latest baseline.
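Since a JSONL file stores one JSON object per line, a quick way to sanity-check the raw data before generating a baseline is to count its non-empty lines. The sketch below assumes that each line holds one node's record, which this page does not state explicitly; `count_records` is a hypothetical helper name.

```bash
# Hypothetical helper: print the number of non-blank lines in a JSONL file.
# Assumption: one record per line, so this approximates the number of records.
count_records() {
  grep -c '[^[:space:]]' "$1"
}
```

Calling `count_records ./results-summary.jsonl` prints a single number, which can be compared against the number of machines expected to contribute to the baseline.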

### Rule File

**Summary Rule File** uses the same format as the rule file defined in [Result Summary](./result-summary.md).

**Diagnosis Rule File** uses the same format as the rule file defined in [Data Diagnosis](./data-diagnosis.md).

## Output

The baseline file (`baseline.json`), built from the results of multiple machines, is generated under `${output_dir}`.