Unverified commit 2b2fa97b, authored by Santiago Galiano Segura and committed by GitHub

add cocoteros_es dataset (#2721)


Co-authored-by: Robiert Sepulveda Torres <rsepulveda911112@gmail.com>
parent 2f403fa0
@@ -15,6 +15,7 @@ The datasets included in SpanishBench that have been made public in previous pub
| Task | Category | Paper title | Homepage |
|:-------------:|:-----:|:-------------:|:-----:|
| Belebele_es | Reading Comprehension | [The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants](https://arxiv.org/abs/2308.16884) | https://huggingface.co/datasets/facebook/belebele |
| Cocoteros_es | Commonsense Reasoning | [COCOTEROS: A Spanish Corpus with Contextual Knowledge for Natural Language Generation](https://besaya.infor.uva.es/sepln24/paper04.pdf) | https://huggingface.co/datasets/gplsi/cocoteros |
| EsCoLA | Linguistic Acceptability | [EsCoLA: Spanish Corpus of Linguistic Acceptability](https://aclanthology.org/2024.lrec-main.554/) | https://huggingface.co/datasets/nbel/EsCoLA |
| FLORES_es | Translation | [The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation](https://arxiv.org/abs/2106.03193) | https://huggingface.co/datasets/facebook/flores |
| MGSM_es | Math | [Language Models are Multilingual Chain-of-Thought Reasoners](https://arxiv.org/abs/2210.03057) | https://huggingface.co/datasets/juletxara/mgsm |
@@ -77,6 +78,7 @@ The datasets included in SpanishBench that have been made public in previous pub
The following tasks evaluate tasks on SpanishBench dataset using various scoring methods.
- `belebele_spa_Latn`
- `cocoteros_es`
- `copa_es`
- `escola`
- `flores_es`
......
task: cocoteros_es
dataset_path: gplsi/cocoteros
dataset_name: null
output_type: generate_until
doc_to_text: "Genera una frase corta con estas palabras: {{keywords}}. El contexto es: {{context}} \n\nRespuesta:"
doc_to_target: "{{text}}"
training_split: train
test_split: test
target_delimiter: ' '
generation_kwargs:
  max_gen_toks: 40
  until:
    - "\n"
metric_list:
  - metric: bleu
    aggregation: bleu
    higher_is_better: true
  - metric: !function utils.rouge1
    aggregation: !function utils.rouge1_agg
    higher_is_better: true
metadata:
  version: 1.0
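
To make the configuration above easier to follow, here is a minimal Python sketch of what it describes: filling the `doc_to_text` template from a record, cutting the model output at the `until` stop sequence, and scoring the result against `{{text}}`. It is an illustration under stated assumptions, not the harness code. The helpers `render`, `truncate_at_stop`, and `unigram_f1` are hypothetical names. The regex substitution is a stand-in for the Jinja2 rendering the harness performs, and `unigram_f1` is a simple whitespace-token overlap score, not the `utils.rouge1` function referenced in the config (which this diff does not show). The sample record is invented; only its field names come from the config.

```python
import re
from collections import Counter

# Prompt template copied from the task config (doc_to_text).
PROMPT = (
    "Genera una frase corta con estas palabras: {{keywords}}. "
    "El contexto es: {{context}} \n\nRespuesta:"
)


def render(template, doc):
    """Fill {{field}} placeholders from doc (a stand-in for Jinja2 rendering)."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(doc[m.group(1)]), template)


def truncate_at_stop(generation, until):
    """Cut the generation at the earliest stop sequence, mirroring `until`."""
    cut = len(generation)
    for stop in until:
        idx = generation.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return generation[:cut]


def unigram_f1(prediction, reference):
    """ROUGE-1-style F1 over lowercased whitespace tokens (illustrative only)."""
    pred = Counter(prediction.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((pred & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical record: field names match the config, values are invented.
doc = {
    "keywords": "perro, parque",
    "context": "Una tarde de domingo.",
    "text": "El perro corre por el parque",
}

prompt = render(PROMPT, doc)
raw_output = " El perro juega en el parque\nOtra frase"  # pretend model output
answer = truncate_at_stop(raw_output, ["\n"]).strip()
score = unigram_f1(answer, doc["text"])
print(prompt)
print(answer, score)
```

Because `until` contains only `"\n"`, everything after the first line break in the generation is discarded, so each sample is scored on a single generated sentence.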
@@ -13,5 +13,6 @@ task:
  - mgsm_direct_es_spanish_bench
  - flores_es
  - phrases_es
  - cocoteros_es
metadata:
  version: 1.0