Docs: Update accuracy evaluation (#3261)

Unverified commit 55f5fc68, authored Feb 02, 2025 by Chayenne, committed by GitHub Feb 02, 2025

Showing with 3 additions and 3 deletions

.github/pull_request_template.md +1 -1

docs/references/accuracy_evaluation.md +2 -2

--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -13,4 +13,4 @@
 - [ ] Format your code according to the [Code Formatting with Pre-Commit](https://docs.sglang.ai/references/contribution_guide.html#code-formatting-with-pre-commit).
 - [ ] Add unit tests as outlined in the [Running Unit Tests](https://docs.sglang.ai/references/contribution_guide.html#running-unit-tests-adding-to-ci).
 - [ ] Update documentation / docstrings / example tutorials as needed, according to [Writing Documentation](https://docs.sglang.ai/references/contribution_guide.html#writing-documentation-running-docs-ci).
- [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to [Benchmark and Profiling](https://docs.sglang.ai/references/benchmark_and_profiling.html).
+- [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to [Benchmark and Profiling](https://docs.sglang.ai/references/benchmark_and_profiling.html) and [Accuracy Results](https://docs.sglang.ai/references/accuracy_evaluation.html).
--- a/docs/references/accuracy_evaluation.md
+++ b/docs/references/accuracy_evaluation.md
 # Measuring Model Accuracy in SGLang

-This guide shows how to evaluate model accuracy using SGLang's [built-in benchmarks](https://github.com/sgl-project/sglang/tree/b045841baeff37a5601fcde23fa98bd09d942c36/benchmark).
+This guide shows how to evaluate model accuracy using SGLang's [built-in benchmarks](https://github.com/sgl-project/sglang/tree/b045841baeff37a5601fcde23fa98bd09d942c36/benchmark). Please include accuracy on crucial benchmarks in your PR if you make modifications on the model side, like the kernel and model architecture.

 ## Benchmarking Model Accuracy

@@ -47,7 +47,7 @@ def few_shot_gsm8k(s, question):
    )
 ```

-These adjustments give us the us the reported accuracy.
+These adjustments should return the desired accuracy.

 ## Extending Evaluation Capabilities