
#### **1. Applicable Categories**
- Datacenter

---

#### **2. Applicable Scenarios for Each Category**
- Offline

---

#### **3. Applicable Compliance Tests**
- TEST01

---

#### **4. Latency Threshold for Server Scenarios**
- Not applicable

---

#### **5. Validation Dataset: Unique Samples**
The number of **unique samples** in the validation dataset and the QSL size are specified in:
- [X] [inference policies benchmark section](https://github.com/mlcommons/inference_policies/blob/master/inference_rules.adoc#41-benchmarks)
- [X] [mlperf.conf](https://github.com/mlcommons/inference/blob/master/loadgen/mlperf.conf)
- [X] [Inference benchmark docs](https://github.com/mlcommons/inference/blob/docs/docs/index.md)
  *(Ensure QSL size overflows the system cache if possible.)*
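The cache-overflow note above comes down to simple arithmetic, which the following minimal sketch illustrates. The function names, the per-sample footprint, and the cache size are all hypothetical placeholders for illustration; none of them come from the benchmark rules or from `mlperf.conf`.

```python
def min_qsl_samples_to_overflow(cache_bytes: int, bytes_per_sample: int) -> int:
    """Smallest sample count whose combined footprint strictly exceeds the cache."""
    if cache_bytes < 0 or bytes_per_sample <= 0:
        raise ValueError("cache_bytes must be >= 0 and bytes_per_sample must be > 0")
    return cache_bytes // bytes_per_sample + 1


def check_qsl_size(qsl_size: int, unique_samples: int,
                   cache_bytes: int, bytes_per_sample: int) -> list[str]:
    """Return human-readable findings; an empty list means no issue was found."""
    findings = []
    if qsl_size > unique_samples:
        findings.append("QSL size exceeds the number of unique samples")
    needed = min_qsl_samples_to_overflow(cache_bytes, bytes_per_sample)
    if qsl_size < needed:
        if unique_samples < needed:
            # The dataset itself is too small, hence the "if possible" caveat.
            findings.append("dataset is too small to overflow the cache")
        else:
            findings.append(f"QSL size is below the {needed} samples needed to overflow the cache")
    return findings


# Hypothetical numbers: a 1000-byte cache and 100-byte samples.
print(min_qsl_samples_to_overflow(1000, 100))
print(check_qsl_size(5, 20, 1000, 100))
```

In practice the per-sample footprint and cache size have to be measured on the system under test; the sketch only shows how the two checklist quantities relate.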

---

#### **6. Equal Issue Mode Applicability**
Whether **Equal Issue Mode** is applicable is documented in:
- [X] [mlperf.conf](https://github.com/mlcommons/inference/blob/master/loadgen/mlperf.conf#L42)
- [X] [Inference benchmark docs](https://github.com/mlcommons/inference/blob/docs/docs/index.md)
  *(Relevant if sample processing times are inconsistent across inputs.)*
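To show why equal issue matters when processing time varies per sample, here is a minimal sketch of one way to build a query schedule in which every loaded sample is issued the same number of times: concatenate whole shuffled permutations of the sample indices. This is an illustration under that assumption, not LoadGen's actual code, and `equal_issue_schedule` is a hypothetical name.

```python
import random
from collections import Counter


def equal_issue_schedule(sample_count: int, min_queries: int, seed: int = 0) -> list[int]:
    """Concatenate full permutations until at least min_queries samples are scheduled."""
    if sample_count <= 0 or min_queries < 0:
        raise ValueError("sample_count must be > 0 and min_queries must be >= 0")
    rng = random.Random(seed)
    # Round up to a whole number of permutations so no sample is over-represented.
    rounds = max(1, -(-min_queries // sample_count))
    schedule: list[int] = []
    for _ in range(rounds):
        permutation = list(range(sample_count))
        rng.shuffle(permutation)
        schedule.extend(permutation)
    return schedule


schedule = equal_issue_schedule(sample_count=4, min_queries=10)
print(len(schedule), Counter(schedule))
```

Because the schedule always contains whole permutations, a slow sample cannot be drawn more often than a fast one, so the measured throughput does not depend on which samples happened to be picked.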

---

#### **7. Expected Accuracy and `accuracy.txt` Contents**
- [X] Expected accuracy updated in the [inference policies](https://github.com/mlcommons/inference_policies/blob/master/inference_rules.adoc#41-benchmarks)
- [X] `accuracy.txt` file is generated by the reference accuracy script from the MLPerf accuracy log and is validated by the submission checker.

---

#### **8. Reference Model Details**
- [X] Reference model details updated in [Inference benchmark docs](https://github.com/mlcommons/inference/blob/docs/docs/index.md)  

---

#### **9. Reference Implementation Dataset Coverage**
- [X] Reference implementation successfully processes the entire validation dataset during:
  - [X] Performance runs
  - [X] Accuracy runs
  - [X] Compliance runs  
- [X] Valid log files that pass the submission checker are generated for all runs ([sample run](https://github.com/mlcommons/mlperf_inference_unofficial_submissions_v5.0/tree/main/closed/MLCommons/results/mlc-server-reference-gpu-pytorch_v2.4.0-cu124/rgat/offline/performance/run_1)).
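One way to spot-check dataset coverage for an accuracy run is to confirm that every sample index appears in the accuracy log. The sketch below assumes the log is a JSON array whose entries carry a `qsl_idx` field, which is how LoadGen accuracy logs are commonly laid out. The helper names and the example entries are hypothetical.

```python
import json


def load_accuracy_log(path: str) -> list[dict]:
    """Read an accuracy log that is assumed to be a single JSON array."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def missing_sample_indices(log_entries: list[dict], total_samples: int) -> list[int]:
    """Return the sample indices in [0, total_samples) that never appear in the log."""
    seen = {int(entry["qsl_idx"]) for entry in log_entries}
    return sorted(set(range(total_samples)) - seen)


# Hypothetical in-memory entries standing in for a parsed log file.
entries = [{"qsl_idx": 0}, {"qsl_idx": 2}, {"qsl_idx": 2}, {"qsl_idx": 3}]
print(missing_sample_indices(entries, total_samples=5))
```

An empty result means every index in the validation set was seen at least once. This check complements the submission checker rather than replacing it.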

---

#### **10. Test Runs with Smaller Input Sets**
- [X] Verified the reference implementation can perform test runs with a smaller subset of inputs for:
  - [X] Performance runs
  - [X] Accuracy runs

---

#### **11. Dataset and Reference Model Instructions**
- [X] Clear instructions provided for:
  - [X] Downloading the dataset and reference model.
  - [X] Using the dataset and model for the benchmark.

---

#### **12. Recommended System Requirements for Running the Reference Implementation**
- [X] Documented in [system_requirements.yml](https://github.com/mlcommons/inference/blob/docs/docs/system_requirements.yml#L44)

---

#### **13. Submission Checker Modifications**
- [X] All necessary changes made to the **submission checker** to validate the benchmark.

---

#### **14. Sample Log Files**
- [X] Sample logs included for all applicable scenario runs:
  - [X] Offline 
    - [X] [`mlperf_log_summary.txt`](https://github.com/mlcommons/mlperf_inference_unofficial_submissions_v5.0/blob/main/closed/MLCommons/results/mlc-server-reference-gpu-pytorch_v2.4.0-cu124/rgat/offline/performance/run_1/mlperf_log_summary.txt)
    - [X] [`mlperf_log_detail.txt`](https://github.com/mlcommons/mlperf_inference_unofficial_submissions_v5.0/blob/main/closed/MLCommons/results/mlc-server-reference-gpu-pytorch_v2.4.0-cu124/rgat/offline/performance/run_1/mlperf_log_detail.txt)  
- [X] Sample logs pass the submission checker and the applicable compliance runs. [Link](https://htmlpreview.github.io/?https://github.com/mlcommons/mlperf_inference_unofficial_submissions_v5.0/blob/refs/heads/auto-update/closed/MLCommons/results/mlc-server-reference-gpu-pytorch_v2.4.0-cu124/summary.html)