Analyzer - Add support to store values of metrics in data diagnosis (#392)
**Description**
Add support to store values of metrics in data diagnosis.
Take the following rules as an example:
```
nccl_store_rule:
  categories: NCCL_DIS
  store: True
  metrics:
    - nccl-bw:allreduce-run0/allreduce_1073741824_busbw
    - nccl-bw:allreduce-run1/allreduce_1073741824_busbw
    - nccl-bw:allreduce-run2/allreduce_1073741824_busbw
    - nccl-bw:allreduce-run3/allreduce_1073741824_busbw
    - nccl-bw:allreduce-run4/allreduce_1073741824_busbw
nccl_rule:
  function: multi_rules
  criteria: 'lambda label:True if min(label["nccl_store_rule"].values())/max(label["nccl_store_rule"].values())<0.95 else False'
  categories: NCCL_DIS
```
**nccl_store_rule** stores the values of its metrics in a dict and saves them into `label["nccl_store_rule"]`; **nccl_rule** can then use those values through `label["nccl_store_rule"].values()` in its criteria.
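To illustrate how the `nccl_rule` criteria behaves, here is a minimal Python sketch that evaluates the same lambda string against a hand-built `label` dict. The bandwidth numbers and the names `degraded_label`, `healthy_label`, and `check` are made up for illustration, and evaluating the string with `eval` is only an assumption made for this example, not a description of how the analyzer itself processes the criteria.

```python
# Criteria string copied from the nccl_rule example above.
criteria = (
    'lambda label:True if min(label["nccl_store_rule"].values())'
    '/max(label["nccl_store_rule"].values())<0.95 else False'
)

# Hypothetical stored values (not real measurements): run3 is clearly slower.
degraded_label = {
    'nccl_store_rule': {
        'nccl-bw:allreduce-run0/allreduce_1073741824_busbw': 230.0,
        'nccl-bw:allreduce-run1/allreduce_1073741824_busbw': 231.0,
        'nccl-bw:allreduce-run2/allreduce_1073741824_busbw': 229.0,
        'nccl-bw:allreduce-run3/allreduce_1073741824_busbw': 210.0,
        'nccl-bw:allreduce-run4/allreduce_1073741824_busbw': 232.0,
    }
}

# Hypothetical stored values where every run reports the same bandwidth.
healthy_label = {
    'nccl_store_rule': {
        f'nccl-bw:allreduce-run{i}/allreduce_1073741824_busbw': 230.0
        for i in range(5)
    }
}

# Turn the criteria string into a callable (assumption for this sketch only).
check = eval(criteria)

# min/max = 210/232, which is below 0.95, so the rule is violated.
degraded_violates = check(degraded_label)
# min/max = 1.0, which is not below 0.95, so the rule passes.
healthy_violates = check(healthy_label)
```

In this sketch the rule fires only when the slowest run falls more than 5% below the fastest one, which is the run-to-run variation the criteria is written to catch.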