@@ -33,6 +33,7 @@ The datasets included in CatalanBench that have been made public in previous pub
| VeritasQA_ca | Truthfulness | VeritasQA: A Truthfulness Benchmark Aimed at Multilingual Transferability | TBA |
| WNLI-ca | Natural Language Inference | [Building a Data Infrastructure for a Mid-Resource Language: The Case of Catalan](https://aclanthology.org/2024.lrec-main.231/) | https://huggingface.co/datasets/projecte-aina/wnli-ca |
| XNLI-ca | Natural Language Inference | [Building a Data Infrastructure for a Mid-Resource Language: The Case of Catalan](https://aclanthology.org/2024.lrec-main.231/) | https://huggingface.co/datasets/projecte-aina/xnli-ca |
| XNLI-va | Natural Language Inference | Building a Data Infrastructure for a Mid-Resource Language: The Case of Valencian | https://huggingface.co/datasets/gplsi/xnli_va |
| XQuAD-ca | Question Answering | [Building a Data Infrastructure for a Mid-Resource Language: The Case of Catalan](https://aclanthology.org/2024.lrec-main.231/) | https://huggingface.co/datasets/projecte-aina/xquad-ca |
...
@@ -126,6 +127,7 @@ The following tasks evaluate tasks on CatalanBench dataset using various scoring
-`veritasqa_mc2_ca`
-`wnli_ca`
-`xnli_ca`
-`xnli_va`
-`xquad_ca`
-`xstorycloze_ca`
...
@@ -148,3 +150,4 @@ If other tasks on this dataset are already supported:
### Changelog
version 2.0: (2025-Mar-18) add [`cocoteros_va`](./cocoteros_va.yaml) task.
version 2.1: (2025-Jul-30) add [`xnli_va`](./xnli_va.yaml) task.