update in task guide

db0c46b0 · lintangsutawika · 22ece14a · db0c46b0
Commit db0c46b0 authored Aug 13, 2023 by lintangsutawika

Showing with 9 additions and 0 deletions

docs/advanced_task_guide.md +9 -0

--- a/docs/advanced_task_guide.md
+++ b/docs/advanced_task_guide.md
@@ -236,3 +236,12 @@ Generative tasks:

 Tasks using complex filtering:
 - GSM8k with CoT (+ with Self-Consistency): (`lm_eval/tasks/gsm8k/gsm8k-cot.yaml` ; `lm_eval/tasks/gsm8k/gsm8k-cot-self-consistency.yaml`)
+
+
+## Benchmarks
+
+When evaluating a language model, it is not unusual to test across a number of tasks that may not be related to one another in order to assess a variety of capabilities. Listing the full set of tasks each time, or adding a new group name to the YAML of each individual task, can be cumbersome.
+
+To solve this, we can create a benchmark YAML config: a config that contains the names of the tasks that should be included in a particular benchmark. The config has two main keys: `group`, which denotes the name of the benchmark, and `task`, which lists the tasks.
+
+A benchmark is called the same way as any task, by passing it to `--tasks`. Benchmarks can be added in `lm_eval/benchmarks/`.
\ No newline at end of file
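
For illustration, a benchmark config of the kind the added section describes might look like the sketch below. Only the `group` and `task` keys and the `lm_eval/benchmarks/` directory come from the text above. The file name, the benchmark name `my_benchmark`, and the task names `task_a`, `task_b`, `task_c` are hypothetical placeholders, and writing `task` as a YAML list is an assumption.

```yaml
# Hypothetical file: lm_eval/benchmarks/my_benchmark.yaml
# `group` names the benchmark; `task` lists the tasks it contains.
group: my_benchmark   # placeholder benchmark name
task:                 # assumed to be a YAML list of task names
  - task_a            # placeholder; use the names of existing tasks
  - task_b
  - task_c
```

Under these assumptions, passing `my_benchmark` to `--tasks` would select all three listed tasks at once, without editing the YAML of each individual task.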