Commit 2f403fa0 authored by Naiara Perez, committed by GitHub

add Basque translation of ARC and PAWS to BasqueBench (#2732)



* add Basque translation of ARC and PAWS to BasqueBench

* pre-commit

---------
Co-authored-by: Baber <baber@hey.com>
parent 01849b40
......@@ -5,14 +5,16 @@
BasqueBench is a benchmark for evaluating language models on Basque tasks. That is, it evaluates the ability of a language model to understand and generate Basque text. BasqueBench offers a combination of pre-existing, open datasets and datasets developed exclusively for this benchmark. All the details of BasqueBench will be published in a paper soon.
The new evaluation datasets included in BasqueBench are:
| Task | Category | Homepage |
|:-------------:|:-----:|:-----:|
| MGSM_eu | Math | https://huggingface.co/datasets/HiTZ/MGSM-eu |
| PIQA_eu | Question Answering | https://huggingface.co/datasets/HiTZ/PIQA-eu |
| WNLI_eu | Natural Language Inference | https://huggingface.co/datasets/HiTZ/wnli-eu |
| XCOPA_eu | Commonsense Reasoning | https://huggingface.co/datasets/HiTZ/XCOPA-eu |
| Task | Category | Homepage |
|:--------:|:--------------------------:|:---------------------------------------------:|
| ARC_eu | Question Answering | https://huggingface.co/datasets/HiTZ/ARC-eu |
| MGSM_eu | Math | https://huggingface.co/datasets/HiTZ/MGSM-eu |
| PAWS_eu | Paraphrasing | https://huggingface.co/datasets/HiTZ/PAWS-eu |
| PIQA_eu | Question Answering | https://huggingface.co/datasets/HiTZ/PIQA-eu |
| WNLI_eu | Natural Language Inference | https://huggingface.co/datasets/HiTZ/WNLI-eu |
| XCOPA_eu | Commonsense Reasoning | https://huggingface.co/datasets/HiTZ/XCOPA-eu |
The datasets included in BasqueBench that have been made public in previous pubications are:
The datasets included in BasqueBench that have been made public in previous publications are:
| Task | Category | Paper title | Homepage |
|:-------------:|:-----:|:-------------:|:-----:|
......@@ -73,6 +75,8 @@ The datasets included in BasqueBench that have been made public in previous pubi
#### Tasks
The following tasks evaluate models on the BasqueBench datasets using various scoring methods.
- `arc_eu_challenge`
- `arc_eu_easy`
- `belebele_eus_Latn`
- `eus_exams_eu`
- `eus_proficiency`
......@@ -97,6 +101,7 @@ The following tasks evaluate tasks on BasqueBench dataset using various scoring
- `flores_pt-eu`
- `mgsm_direct_eu`
- `mgsm_native_cot_eu`
- `paws_eu`
- `piqa_eu`
- `qnlieu`
- `wnli_eu`
......
include: arc_eu_easy.yaml
task: arc_eu_challenge
dataset_name: ARC-Challenge
task: arc_eu_easy
dataset_path: HiTZ/ARC-eu
dataset_name: ARC-Easy
output_type: multiple_choice
training_split: null
validation_split: validation
test_split: test
doc_to_text: "Galdera: {{question}}\nErantzuna:"
doc_to_target: "{{choices.label.index(answerKey)}}"
doc_to_choice: "{{choices.text}}"
should_decontaminate: true
doc_to_decontamination_query: "Galdera: {{question}}\nErantzuna:"
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
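To make the templates in the ARC config above concrete, here is a minimal Python sketch of what `doc_to_text`, `doc_to_target`, and `doc_to_choice` compute for a single document. It assumes a document shaped like the fields the templates reference (`question`, `choices.label`, `choices.text`, `answerKey`). The sample question and the helper names `arc_prompt` and `arc_target_index` are hypothetical and not part of the commit; the harness presumably renders the `{{...}}` templates itself rather than calling functions like these.

```python
# Hypothetical document; field names follow the templates in the config.
doc = {
    "question": "Zein da Lurraren satelite naturala?",
    "choices": {
        "label": ["A", "B", "C", "D"],
        "text": ["Eguzkia", "Ilargia", "Marte", "Artizarra"],
    },
    "answerKey": "B",
}


def arc_prompt(doc):
    # Mirrors doc_to_text: "Galdera: {{question}}\nErantzuna:"
    return "Galdera: " + doc["question"] + "\nErantzuna:"


def arc_target_index(doc):
    # Mirrors doc_to_target: position of the answer key among the labels
    return doc["choices"]["label"].index(doc["answerKey"])


prompt = arc_prompt(doc)
target = arc_target_index(doc)
# doc_to_choice is simply the list of answer texts; the target indexes into it
gold = doc["choices"]["text"][target]

print(prompt)
print(target, gold)
```

The target is an index rather than the label itself because `doc_to_choice` exposes only the answer texts, so the gold answer has to be located by position.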
group: basque_bench
task:
  - arc_eu_challenge
  - arc_eu_easy
  - belebele_eus_Latn
  - xstorycloze_eu
  - flores_eu
......@@ -14,6 +16,7 @@ task:
  - xcopa_eu
  - mgsm_direct_eu
  - mgsm_native_cot_eu
  - paws_eu
  - piqa_eu
metadata:
  version: 1.0
task: paws_eu
dataset_path: HiTZ/PAWS-eu
dataset_name: null
output_type: multiple_choice
test_split: test
process_docs: !function utils.paws_process_docs
doc_to_text: ''
doc_to_target: label
doc_to_choice: '{{[sentence1+", ezta? Ez, "+sentence2, sentence1+", ezta? Bai, "+sentence2]}}'
target_delimiter: ''
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
from functools import partial
# ~~~~~~~~~~~ XCOPA ~~~~~~~~~~~ #
xcopa_connectors = {"cause": " Izan ere,", "effect": " Beraz,"}
......@@ -18,4 +15,28 @@ def xcopa_doc_to_choice(doc):
    return [convert_choice(doc["choice1"]), convert_choice(doc["choice2"])]
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #
# ~~~~~~~~~~~ PAWS-X ~~~~~~~~~~~ #
def paws_process_docs(dataset):
    empty_docs = []

    def _process_doc(doc):
        if doc["sentence1"] not in [None, ""] and doc["sentence2"] not in [None, ""]:
            # Remove final punctuation mark in the first sentence
            if doc["sentence1"].endswith((".", ",", ";")):
                doc["sentence1"] = doc["sentence1"][:-1]
            # Start the second sentence in lowercase (to be used after "Yes, ...")
            doc["sentence2"] = lowercase_first_letter(doc["sentence2"])
            return doc
        else:
            empty_docs.append(doc)
            return doc

    def lowercase_first_letter(text):
        return text[0].lower() + text[1:]

    return dataset.filter(
        lambda doc: doc["sentence1"] not in [None, ""]
        and doc["sentence2"] not in [None, ""]
    ).map(_process_doc)
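As a rough illustration of how this preprocessing combines with the `doc_to_choice` template in the `paws_eu` config, the sketch below applies the same two steps to a plain dictionary and then builds the two candidate strings. It does not use the `datasets` library. The sample sentences and the helper names `process_doc` and `build_choices` are hypothetical, and the mapping of label 1 to the "Bai" (yes) choice is read off the order of the list in the template.

```python
def lowercase_first_letter(text):
    return text[0].lower() + text[1:]


def process_doc(doc):
    # Work on a copy so the input example is left untouched
    doc = dict(doc)
    # Remove final punctuation mark in the first sentence
    if doc["sentence1"].endswith((".", ",", ";")):
        doc["sentence1"] = doc["sentence1"][:-1]
    # Start the second sentence in lowercase so it can follow "Bai, " / "Ez, "
    doc["sentence2"] = lowercase_first_letter(doc["sentence2"])
    return doc


def build_choices(doc):
    # Mirrors the doc_to_choice template: index 0 is "Ez" (no), index 1 is "Bai" (yes)
    s1, s2 = doc["sentence1"], doc["sentence2"]
    return [s1 + ", ezta? Ez, " + s2, s1 + ", ezta? Bai, " + s2]


# Made-up example pair, labeled as a paraphrase (label 1)
example = {"sentence1": "Etxea handia da.", "sentence2": "Handia da etxea.", "label": 1}

processed = process_doc(example)
choices = build_choices(processed)

print(choices[processed["label"]])
# prints: Etxea handia da, ezta? Bai, handia da etxea.
```

Because `doc_to_text` is empty in the config, each candidate is a complete sentence pair joined by "ezta?" (roughly "right?"), which is why the first sentence loses its final punctuation and the second starts in lowercase.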