---
id: baseline-generation
---

# Baseline Generation

## Introduction

This tool generates a baseline JSON file from the raw benchmark results of multiple machines.

## Usage

1. [Install SuperBench](../getting-started/installation.mdx) on the local machine.

2. Prepare the raw data and rule files on the local machine.

3. Generate the baseline file automatically using the `sb result generate-baseline` command. The full command reference is in [SuperBench CLI](../cli.md).

    ```bash
    sb result generate-baseline --data-file ./results-summary.jsonl --summary-rule-file ./summary-rule.yaml --diagnosis-rule-file ./diagnosis-rule.yaml --output-dir ${output_dir}
    ```

4. Find the output file named `baseline.json` under `${output_dir}`.

## Input

The input includes up to 4 files, two of which are optional:

- **Raw Data**: a jsonl file containing the results of multiple nodes, generated automatically by the SuperBench runner.

:::tip Tips
The raw data file can be found at `${output_dir}/results-summary.jsonl` after each successful run.
:::

- **Summary Rule File**: a YAML file that defines how to generate the result summary, including how to classify the metrics and which statistical methods (P50, mean, etc.) are applied.

- **Diagnosis Rule File (optional)**: a YAML file containing per-metric rules used to filter out defective machines for diagnosis. If it is not specified, no machines are filtered.

- **Previous Baseline File (optional)**: a baseline file in JSON format produced by a previous run, to be merged into the latest baseline.

### Rule File

The **Summary Rule File** is the same as the rule file defined in [Result Summary](./result-summary.md).

The **Diagnosis Rule File** is the same as the rule file defined in [Data Diagnosis](./data-diagnosis.md).

## Output

The baseline file (`baseline.json`), generated from the results of multiple machines, is written under `${output_dir}`.
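
As a quick sanity check before and after generation, the input and output files can be inspected with a few lines of Python. The sketch below is not part of SuperBench: `count_raw_records` and `load_baseline` are hypothetical helper names, and the only assumptions are that the raw data file holds one JSON record per line and that `baseline.json` is valid JSON. Nothing is assumed about the keys inside either file.

```python
import json
from pathlib import Path


def count_raw_records(data_file):
    """Count the JSON records in a jsonl file (hypothetical helper).

    Assumes one JSON record per non-empty line; a malformed line raises
    json.JSONDecodeError, which flags a corrupted raw data file early.
    """
    count = 0
    with open(data_file, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                json.loads(line)  # validate the record, discard the value
                count += 1
    return count


def load_baseline(output_dir):
    """Parse baseline.json under output_dir and return it (hypothetical helper)."""
    path = Path(output_dir) / "baseline.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)
```

For example, `count_raw_records("./results-summary.jsonl")` reports how many records will feed the baseline, and `load_baseline(output_dir)` confirms that the generated file exists and parses.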