Analyzer - Fix bugs in data diagnosis (#355)

**Description**
Fix bugs in data diagnosis.

**Major Revision**
- Add support for getting the baseline of a metric that uses custom benchmark naming with ':', such as 'nccl-bw:default/allreduce_8_bw:0'.
- Save raw data of all metrics, rather than only the metrics defined in diagnosis_rules.yaml, when output_all is True.
- Fix the bug of using the wrong column index when applying formats (red color and percent) in the Excel output.

Commit 54da021b, authored Jun 01, 2022 by user4543, committed by GitHub Jun 01, 2022
3 changed files
--- a/superbench/analyzer/data_diagnosis.py
+++ b/superbench/analyzer/data_diagnosis.py
@@ -63,8 +63,8 @@ class DataDiagnosis(RuleBase):
        if metric in baseline:
            return baseline[metric]
        else:
-            # exclude rank info
-            short = metric.split(':')[0]
+            # exclude rank info, for example, '.*:\d+'->'.*'
+            short = metric.strip(metric.split(':')[-1]).strip(':')
            if short in baseline:
                return baseline[short]
            # baseline not defined
@@ -221,7 +221,7 @@ class DataDiagnosis(RuleBase):
            DataFrame: all nodes' detailed information inluding ['Accept','#Issues','Category','Issue_Details']
        """
        append_columns = ['Accept', '#Issues', 'Category', 'Issue_Details']
-        all_data_df = (raw_data_df[self._enable_metrics]).astype('float64')
+        all_data_df = (raw_data_df).astype('float64')

        if data_not_accept_df.shape[0] == 0:
            all_data_df['Accept'] = [True for i in range(len(all_data_df))]
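
The rank-stripping fallback in the first hunk of data_diagnosis.py (`'.*:\d+'->'.*'`) can also be written with a regular expression anchored at the end of the name. This is a minimal sketch, not the code in the commit: the names `strip_rank_suffix` and `lookup_baseline` are hypothetical, and returning `None` when no baseline is defined is an assumption, since the hunk does not show that branch. One reason to consider this form is that `str.strip` treats its argument as a set of characters to remove from both ends rather than as a suffix, so a metric name whose leading characters happen to match the rank could be trimmed at the front as well.

```python
import re

# Matches a trailing ':<digits>' rank suffix, e.g. ':0' in 'allreduce_8_bw:0'.
_RANK_SUFFIX = re.compile(r':\d+$')


def strip_rank_suffix(metric):
    """Return the metric name without a trailing ':<rank>' suffix, if any."""
    return _RANK_SUFFIX.sub('', metric)


def lookup_baseline(baseline, metric, default=None):
    """Hypothetical helper: try the full metric name, then the rank-free name.

    Returning `default` (None) for an undefined baseline is an assumption.
    """
    if metric in baseline:
        return baseline[metric]
    short = strip_rank_suffix(metric)
    return baseline.get(short, default)


if __name__ == '__main__':
    baseline = {'nccl-bw:default/allreduce_8_bw': 100.0}
    # The custom benchmark name contains ':' itself; only the rank is removed.
    print(lookup_baseline(baseline, 'nccl-bw:default/allreduce_8_bw:0'))
```

Only the final `:<digits>` group is removed, so custom benchmark names such as 'nccl-bw:default/...' keep their inner ':' and names without a rank suffix pass through unchanged.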

--- a/superbench/analyzer/file_handler.py
+++ b/superbench/analyzer/file_handler.py
@@ -120,7 +120,8 @@ def output_excel_data_not_accept(writer, data_not_accept_df, rules):

            for rule in rules:
                for metric in rules[rule]['metrics']:
-                    col_index = columns.index(metric)
+                    # The column index of the metrics should start from 1
+                    col_index = columns.index(metric) + 1
                    # Apply percent format for the columns whose rules are variance type.
                    if rules[rule]['function'] == 'variance':
                        worksheet.conditional_format(
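
The off-by-one fix above can be illustrated in isolation. The sketch below assumes that the worksheet's first column holds the DataFrame index (for example the node name), which would explain why metric columns start at 1; the diff itself only states that they do. The helper `worksheet_col_index` and its `index_cols` parameter are hypothetical names used for illustration.

```python
def worksheet_col_index(columns, metric, index_cols=1):
    """Map a metric's position in the DataFrame columns to a worksheet column.

    Assumption: the worksheet has `index_cols` leading columns occupied by the
    DataFrame index, so data columns are shifted right by that amount.
    """
    return columns.index(metric) + index_cols


if __name__ == '__main__':
    columns = ['kernel-launch/event_time', 'mem-bw/H2D_Mem_BW']
    for metric in columns:
        # Without the offset, formatting for the first metric would land on
        # the index column (worksheet column 0) instead of the metric column.
        print(metric, worksheet_col_index(columns, metric))
```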

--- a/tests/data/diagnosis_summary.json
+++ b/tests/data/diagnosis_summary.json