Unverified Commit 558d0d71 authored by Baber Abbasi's avatar Baber Abbasi Committed by GitHub
Browse files

mmlu-pro: add newlines to task descriptions (not leaderboard) (#2334)



* add newlines to task descriptions; increment versions

* fix task tests (with groups)

* Apply suggestions from code review

---------
Co-authored-by: default avatarHailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
parent 7d242381
......@@ -57,3 +57,8 @@ If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
### Changelog
* (tasks, group) 2024-09-23 -- (version 1 --> version 2)
* Added one newline to task description(s) as per [reference implementation](https://github.com/TIGER-AI-Lab/MMLU-Pro/blob/47b9891aacb8bd7cda29d5c5ba17b9434dd333bc/evaluate_from_local.py#L93)
......@@ -30,4 +30,4 @@ metric_list:
ignore_case: true
ignore_punctuation: true
metadata:
version: 0.0
version: 1.0
......@@ -20,4 +20,4 @@ aggregate_metric_list:
weight_by_size: true
filter_list: custom-extract
metadata:
version: 1.0
version: 2.0
description: "The following are multiple choice questions (with answers) about biology. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about biology. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_biology"
task_alias: "biology"
......
description: "The following are multiple choice questions (with answers) about business. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about business. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_business"
task_alias: "business"
......
description: "The following are multiple choice questions (with answers) about chemistry. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about chemistry. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_chemistry"
task_alias: "chemistry"
......
description: "The following are multiple choice questions (with answers) about computer science. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about computer science. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_computer_science"
task_alias: "computer_science"
......
description: "The following are multiple choice questions (with answers) about economics. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about economics. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_economics"
task_alias: "economics"
......
description: "The following are multiple choice questions (with answers) about engineering. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about engineering. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_engineering"
task_alias: "engineering"
......
description: "The following are multiple choice questions (with answers) about health. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about health. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_health"
task_alias: "health"
......
description: "The following are multiple choice questions (with answers) about history. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about history. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_history"
task_alias: "history"
......
description: "The following are multiple choice questions (with answers) about law. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about law. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_law"
task_alias: "law"
......
description: "The following are multiple choice questions (with answers) about math. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about math. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_math"
task_alias: "math"
......
description: "The following are multiple choice questions (with answers) about other. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about other. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_other"
task_alias: "other"
......
description: "The following are multiple choice questions (with answers) about philosophy. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about philosophy. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_philosophy"
task_alias: "philosophy"
......
description: "The following are multiple choice questions (with answers) about physics. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about physics. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_physics"
task_alias: "physics"
......
description: "The following are multiple choice questions (with answers) about psychology. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice."
description: "The following are multiple choice questions (with answers) about psychology. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.\n"
include: "_default_template_yaml"
task: "mmlu_pro_psychology"
task_alias: "psychology"
......
......@@ -5,6 +5,7 @@ import pytest
import lm_eval.tasks as tasks
from lm_eval.api.task import ConfigurableTask
from lm_eval.evaluator_utils import get_task_list
from .utils import new_tasks
......@@ -20,10 +21,11 @@ def task_class():
# CI: new_tasks checks if any modifications have been made
task_classes = new_tasks()
# Check if task_classes is empty
if task_classes:
return list(task_manager.load_task_or_group(task_classes).values())
else:
return list(task_manager.load_task_or_group(TASKS).values())
task_classes = task_classes if task_classes else TASKS
res = tasks.get_task_dict(task_classes, task_manager)
res = [x.task for x in get_task_list(res)]
return res
@pytest.fixture()
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment