Commit 03e7df51, authored by Lintang Sutawika, committed by GitHub

Allow parameter edits for registered tasks when listed in a benchmark (#1273)

* benchmark yamls allow minor edits of already registered tasks

* add documentation

* removed print
parent 39e7b264
@@ -301,6 +301,23 @@ task:
- hendrycksTest*
```
It is also possible to list an existing task in your benchmark configuration with some adjustments. For example, a few tasks from mmlu are included in `multimedqa`. There, the `task_alias` and `group_alias` (see [here](https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/new_task_guide.md#beautifying-table-display) for more details) are modified to suit the benchmark.
```yaml
group: multimedqa
task:
  - pubmedqa
  - medmcqa
  - medqa_4options
  - task: mmlu_anatomy
    task_alias: "anatomy (mmlu)"
    group_alias: null
  - task: mmlu_clinical_knowledge
    task_alias: "clinical_knowledge (mmlu)"
    group_alias: null
  ...
```
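As a rough illustration of what a mixed task list like the one above looks like once parsed, the sketch below splits the entries into plain task names and per-task override mappings. The `benchmark` dict is a hand-written stand-in for the parsed YAML, and `split_task_entries` is a hypothetical helper introduced only for this example; it is not part of lm-evaluation-harness.

```python
from typing import Any, Dict, List, Tuple, Union

TaskEntry = Union[str, Dict[str, Any]]


def split_task_entries(
    entries: List[TaskEntry],
) -> Tuple[List[str], List[Dict[str, Any]]]:
    """Separate plain task names from mapping entries that carry overrides."""
    names = [entry for entry in entries if isinstance(entry, str)]
    overrides = [entry for entry in entries if isinstance(entry, dict)]
    return names, overrides


# Hand-written stand-in for the parsed YAML above (YAML `null` becomes None).
benchmark = {
    "group": "multimedqa",
    "task": [
        "pubmedqa",
        "medmcqa",
        "medqa_4options",
        {
            "task": "mmlu_anatomy",
            "task_alias": "anatomy (mmlu)",
            "group_alias": None,
        },
        {
            "task": "mmlu_clinical_knowledge",
            "task_alias": "clinical_knowledge (mmlu)",
            "group_alias": None,
        },
    ],
}

names, overrides = split_task_entries(benchmark["task"])
```

Here `names` holds the tasks that are used unchanged, while each mapping in `overrides` names a registered task under `task` and lists the fields to adjust.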
Alternatively, a benchmark can define tasks that are customized individually. They are defined the same way a yaml task is usually set up.
```yaml
......
@@ -61,11 +61,27 @@ def register_configurable_group(config: Dict[str, str], yaml_path: str = None) -
    task_list = [task for task in all_task_list if type(task) == str]
    for task_config in config_list:
        base_config = {}
        task_name_config = {}
        if "task" in task_config:
            task_name = task_config["task"]
            if task_name in ALL_TASKS:
                task_obj = get_task_dict(task_name)[task_name]
                if type(task_obj) == tuple:
                    _, task_obj = task_obj
                if task_obj is not None:
                    base_config = task_obj._config.to_dict()
                    task_name_config["task"] = f"{group}_{task_name}"
        task_config = utils.load_yaml_config(yaml_path, task_config)
        var_configs = check_prompt_config(
            {
                **base_config,
                **task_config,
                **{"group": group},
                **task_name_config,
            },
            yaml_path=os.path.dirname(yaml_path),
        )
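
The merge order in the hunk above decides which value wins when the same key appears in more than one place: in a dict display, later `**` unpackings override earlier ones. The following is a minimal, self-contained sketch of that precedence, not the harness code itself. `merge_task_config` is a hypothetical helper, and the example values (including `mmlu_stem` and `num_fewshot: 0`) are assumptions chosen for illustration.

```python
from typing import Any, Dict, Optional


def merge_task_config(
    base_config: Dict[str, Any],
    task_config: Dict[str, Any],
    group: str,
    registered_task_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Merge configs so that later sources override earlier ones."""
    # Only rename the task when it was found among the registered tasks,
    # mirroring the conditional assignment to task_name_config in the diff.
    name_override = (
        {"task": f"{group}_{registered_task_name}"}
        if registered_task_name is not None
        else {}
    )
    return {**base_config, **task_config, "group": group, **name_override}


# Hypothetical config of an already registered task (illustrative values).
base_config = {
    "task": "mmlu_anatomy",
    "group": "mmlu_stem",
    "task_alias": "anatomy",
    "group_alias": "stem",
    "num_fewshot": 0,
}
# Overrides as they would appear in the benchmark YAML entry.
task_config = {
    "task": "mmlu_anatomy",
    "task_alias": "anatomy (mmlu)",
    "group_alias": None,
}

merged = merge_task_config(base_config, task_config, "multimedqa", "mmlu_anatomy")
```

In this sketch, keys that the benchmark entry does not mention (`num_fewshot`) are inherited from the registered task, the aliases come from the benchmark entry, and `group` and `task` are set last so that the derived task is named after the benchmark group.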
......
@@ -3,9 +3,21 @@ task:
   - pubmedqa
   - medmcqa
   - medqa_4options
-  - mmlu_anatomy
-  - mmlu_clinical_knowledge
-  - mmlu_college_medicine
-  - mmlu_medical_genetics
-  - mmlu_professional_medicine
-  - mmlu_college_biology
+  - task: mmlu_anatomy
+    task_alias: "anatomy (mmlu)"
+    group_alias: null
+  - task: mmlu_clinical_knowledge
+    task_alias: "clinical_knowledge (mmlu)"
+    group_alias: null
+  - task: mmlu_college_medicine
+    task_alias: "college_medicine (mmlu)"
+    group_alias: null
+  - task: mmlu_medical_genetics
+    task_alias: "medical_genetics (mmlu)"
+    group_alias: null
+  - task: mmlu_professional_medicine
+    task_alias: "professional_medicine (mmlu)"
+    group_alias: null
+  - task: mmlu_college_biology
+    task_alias: "college_biology (mmlu)"
+    group_alias: null