Unverified commit 3a8f1e44, authored by Jess, committed by GitHub

Merge branch 'EleutherAI:main' into main

parents cc58abec 1980a13c
@@ -50,6 +50,10 @@ This mode supports a number of command-line arguments, the details of which can …
* `--wandb_args`: Enables logging of evaluation runs to Weights & Biases, and accepts arguments passed through to `wandb.init`, such as `project` and `job_type`. Full list [here](https://docs.wandb.ai/ref/python/init). e.g., ```--wandb_args project=test-project,name=test-run```
* `--hf_hub_log_args`: To push results and samples to the Hugging Face Hub. First ensure an access token with write access is set in the `HF_TOKEN` environment variable. Then, use this flag to specify the organization, repository name, repository visibility, and whether to push results and samples to the Hub. e.g., ```--hf_hub_log_args hub_results_org=EleutherAI,hub_repo_name=lm-eval-results,public_repo=False,push_samples_to_hub=True```
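For example, a single run that logs to both destinations might look like the following (the model and task here are placeholders; substitute your own):

```bash
lm_eval --model hf \
    --model_args pretrained=EleutherAI/pythia-160m \
    --tasks hellaswag \
    --wandb_args project=test-project,name=test-run \
    --hf_hub_log_args hub_results_org=EleutherAI,hub_repo_name=lm-eval-results,public_repo=False,push_samples_to_hub=True
```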
## External Library Usage

We also support using the library's external API for use within model training loops or other scripts.
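A minimal sketch of what that looks like, assuming the `lm_eval.simple_evaluate` entry point (check the current API documentation for exact signatures and defaults):

```python
import lm_eval

# Evaluate a Hugging Face model on one task; model and task names are placeholders.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["hellaswag"],
    num_fewshot=0,
)
print(results["results"])
```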
# COPAL
### Paper
Title: `COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances`
Abstract: `https://arxiv.org/abs/2311.01012`
`COPAL-ID is an Indonesian causal commonsense reasoning dataset that captures local nuances. It provides a more natural portrayal of day-to-day causal reasoning within the Indonesian (especially Jakartan) cultural sphere. Professionally written and validated from scratch by natives, COPAL-ID is more fluent and free of awkward phrasing, unlike the translated XCOPA-ID.`
Homepage: `https://github.com/haryoa/copal-id`
### Citation
```
@article{wibowo2023copal,
title={COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances},
author={Wibowo, Haryo Akbarianto and Fuadi, Erland Hilman and Nityasya, Made Nindyatama and Prasojo, Radityo Eko and Aji, Alham Fikri},
journal={arXiv preprint arXiv:2311.01012},
year={2023}
}
```
### Groups and Tasks
#### Groups
* `copal_id`
#### Tasks
* `copal_id_standard`: `Standard version of the COPAL dataset; uses formal language and fewer local nuances`
* `copal_id_colloquial`: `Colloquial version of the COPAL dataset; uses informal language and more local nuances`
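Both variants can be run together through the group, or individually (a minimal invocation sketch; the model is a placeholder):

```bash
lm_eval --model hf \
    --model_args pretrained=EleutherAI/pythia-160m \
    --tasks copal_id
```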
### Checklist
For adding novel benchmarks/datasets to the library:
* [x] Is the task an existing benchmark in the literature?
* [x] Have you referenced the original paper that introduced the task?
* [x] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?
If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
The task configs define the two variants. The colloquial variant includes the standard config and overrides only the task-specific fields:

```yaml
include: standard.yaml
task: copal_id_colloquial
task_alias: colloquial
test_split: test_colloquial
```

`standard.yaml`:

```yaml
group: copal_id
task: copal_id_standard
task_alias: standard
dataset_path: haryoaw/COPAL
dataset_name: id
output_type: multiple_choice
test_split: test
doc_to_text: !function utils.doc_to_text_id
doc_to_target: label
doc_to_choice: !function utils.doc_to_choice
metric_list:
  - metric: acc
metadata:
  version: 1.0
```
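The `include:` key pulls in the base file's settings, with the including file's own keys taking precedence. Roughly, the semantics are as in this illustration (a sketch only, not lm-evaluation-harness's actual config loader):

```python
# Illustration only: approximate `include:` override semantics.
import yaml

def load_task_config(path: str) -> dict:
    with open(path) as f:
        cfg = yaml.safe_load(f)
    base_path = cfg.pop("include", None)
    if base_path is not None:
        merged = load_task_config(base_path)
        # Keys in the including file take precedence over the base file.
        merged.update(cfg)
        return merged
    return cfg
```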
`utils.py` provides the prompt and choice formatting:

```python
from functools import partial


def convert_choice(choice):
    # Lowercase the first character so the choice reads as a continuation of the premise.
    return choice[0].lower() + choice[1:]


def doc_to_text(doc, connector):
    # Pick the Indonesian connective for this question type ("cause" or "effect"),
    # then replace the premise's trailing punctuation with it.
    conn = connector[doc["question"]]
    return doc["premise"].strip()[:-1] + f" {conn}"


def doc_to_choice(doc):
    return [convert_choice(doc["choice1"]), convert_choice(doc["choice2"])]


doc_to_text_id = partial(
    doc_to_text,
    connector={
        "cause": "karena",  # "because"
        "effect": "maka",   # "so"/"then"
    },
)
```
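A quick illustration on a made-up document (the field values are hypothetical, but follow the COPA-style schema these functions expect):

```python
doc = {
    "premise": "Jalanan macet total.",     # "The roads are completely jammed."
    "question": "cause",
    "choice1": "Ada demo di pusat kota.",  # "There is a protest downtown."
    "choice2": "Cuaca sangat cerah.",      # "The weather is very clear."
}
print(doc_to_text_id(doc))  # -> "Jalanan macet total karena"
print(doc_to_choice(doc))   # -> ["ada demo di pusat kota.", "cuaca sangat cerah."]
```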
The MMLU continuation tasks share a common template, `_continuation_template_yaml`:

```yaml
dataset_path: hails/mmlu_no_train  # a copy of `cais/mmlu` with no auxiliary_train split
output_type: multiple_choice
test_split: test
fewshot_split: dev
fewshot_config:
  sampler: first_n
doc_to_text: "Question: {{question.strip()}}\nAnswer:"
doc_to_choice: "{{choices}}"
doc_to_target: "{{answer}}"
metadata:
  version: 0.0
```
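To preview what the Jinja template renders to, one can apply it by hand (a sketch using `jinja2` directly; the sample document is made up):

```python
from jinja2 import Template

doc = {
    "question": " What is the order of the group Z_4 x Z_2? ",
    "choices": ["2", "4", "6", "8"],
    "answer": 3,
}
prompt = Template("Question: {{question.strip()}}\nAnswer:").render(**doc)
print(prompt)
# Question: What is the order of the group Z_4 x Z_2?
# Answer:
```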
The group file collects the four per-category subgroups:

```yaml
group: mmlu_continuation
task:
  - mmlu_continuation_stem
  - mmlu_continuation_other
  - mmlu_continuation_social_sciences
  - mmlu_continuation_humanities
```
"dataset_name": "abstract_algebra"
"description": "The following are questions (with answers) about abstract\
\ algebra.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_abstract_algebra"
"dataset_name": "anatomy"
"description": "The following are questions (with answers) about anatomy.\n\
\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_anatomy"
"dataset_name": "astronomy"
"description": "The following are questions (with answers) about astronomy.\n\
\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_astronomy"
"dataset_name": "business_ethics"
"description": "The following are questions (with answers) about business\
\ ethics.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_business_ethics"
"dataset_name": "clinical_knowledge"
"description": "The following are questions (with answers) about clinical\
\ knowledge.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_clinical_knowledge"
"dataset_name": "college_biology"
"description": "The following are questions (with answers) about college\
\ biology.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_biology"
"dataset_name": "college_chemistry"
"description": "The following are questions (with answers) about college\
\ chemistry.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_chemistry"
"dataset_name": "college_computer_science"
"description": "The following are questions (with answers) about college\
\ computer science.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_computer_science"
"dataset_name": "college_mathematics"
"description": "The following are questions (with answers) about college\
\ mathematics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_mathematics"
"dataset_name": "college_medicine"
"description": "The following are questions (with answers) about college\
\ medicine.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_medicine"
"dataset_name": "college_physics"
"description": "The following are questions (with answers) about college\
\ physics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_physics"
"dataset_name": "computer_security"
"description": "The following are questions (with answers) about computer\
\ security.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_computer_security"
"dataset_name": "conceptual_physics"
"description": "The following are questions (with answers) about conceptual\
\ physics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_conceptual_physics"