Commit 18297993 authored by Jess, committed by GitHub

AfroBench: How Good are Large Language Models on African Languages? (#2825)



* add afrixnli to task

* add chat completion

* remove chat completion -untested

* afrimmlu added

* afrimmlu folder update

* afrimmlu folder update

* updated prompt

* remove print

* add afrimgsm -direct

* add squad metric

* fix bash script

* remove direct util, update common yaml

* remove print

* add few shot. metric fixes

* fix direct path, add bash script for gpt models

* added translate test

* update afrixnli tasks

* update afrixnli tasks

* update metrics for afrixnli

* prompt translations fix

* prompt translations fix

* filter and metric fix -mgsm

* remove squad metric

* remove squad metric

* add f1 score to mgsm

* add f1 score to mgsm

* update native-direct with lin

* change f1 function

* add lin to utils

* add utils

* remove test limit

* remove test configs

* add swahili to mmlu

* change eng to ewe in ewe yaml mmlu

* add squad metric to mgsm, remove whitespace filter

* added translate test

* added afrixnli_translate

* fix exact match valueError

* fix exact match valueError

* restructure mmlu folder

* spacing

* remove afrimmlu_translate folder

* add utility

* format task name, clean ups

* modified mgsm

* update on afrimgsm

* update on afrimgsm

* removed utils

* other mgsm varieties

* other mgsm varieties

* adding translate direct

* Update translate_direct_yaml

* add manual xnli prompt, add multichoice for openai models, and adapt multichoice metric for openai model

* edit for open models

* Update translate_direct_yaml

* add verbalizer for xnli

* change xnli from multiple choice to generate

* add manual accuracy scores

* revert xnli to multiple choice

* change afrimgsm utils

* revert xnli to multiple_choice

* cleanups and readmes

* remove openai fixes and unused regex

* pr review changes

* revert metrics.py, task.py and extraction.py to main version

* add afrisenti

* utilities

* pulled from main

* add afrixnli

* add afrimmlu

* update afrixnli prompts

* missing senti language

* fix afrisenti prompt 2

* fix afrisenti prompts

* fix afrisenti prompts

* configure task grouping

* add multiple prompts to afrixnli for irokobench

* add multiple prompts to afrimmlu for irokobench

* Update afrixnli_yaml

* fixes and moves

* fixes and moves

* afrimmlu multiple prompts configs

* remove validation set from afrimmlu

* remove eng from afrimmlu translate test

* correct dataset path

* multiple prompts for mgsm

* file restructure

* afribench grouping

* repo restructuring

* repo restructuring

* update exact match to hugging face exact match and add new mgsm language

* remove decontamination

* update generation kwargs

* update generation kwargs for all mgsm prompts

* remove lang

* update generation kwargs for afrimgsm translatetest

* add afrimgsm cot for direct and translate

* remove eng from translate-cot

* add masakhaPOS tasks

* remove changes from task script

* add masakhanews tasks

* add uhura arc easy

* add afriqa and belebele files

* add tags for easier run. add naija rc

* add new metrics and transformation scripts

* fix afriqa swa fewshot split

* add naijarc

* add afrobench lite tasks

* update afrobench

* update afrobench

* remove unverified files to avoid bugs

* remove files not needed

* add afrobench tasks

* add afrobench tasks

* change to version 1

* change to version 1

* update afrobench

* update afrobench

* restore metric to original script

* update readme instructions

* add individual dataset readmes

* add link to collections

* correct run script

* align with main

* align with main

* align with main

* align with main

* align with main

* align with main

* align with main

* align with main

* failed run fixes

* failed run fixes

* add afrimgsm cot

* Apply precommit fixes

* update mafand dataset name

* pull request fixes

* remove afrihate due to availability

---------
Co-authored-by: Israel Abebe Azime <azime@cg.uni-saarland.de>
Co-authored-by: Israel Abebe Azime <se.israel.abebe@gmail.com>
Co-authored-by: David Adelani <davlanade@gmail.com>
Co-authored-by: theyorubayesian <akin.o.oladipo@gmail.com>
parent cf51e699
# Generated by utils.py
dataset_name: bbj
doc_to_text: 'This text is in Gbomala. Restore all diacritical marks to their proper
  places in the following sentence: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_bbj_prompt_3
# Generated by utils.py
dataset_name: fon
doc_to_text: 'This text is in Fon. Restore all diacritical marks to their proper places
  in the following sentence: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_fon_prompt_3
# Generated by utils.py
dataset_name: ibo
doc_to_text: 'This text is in Igbo. Restore all diacritical marks to their proper
  places in the following sentence: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_ibo_prompt_3
# Generated by utils.py
dataset_name: wol
doc_to_text: 'This text is in Wolof. Restore all diacritical marks to their proper
  places in the following sentence: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_wol_prompt_3
tag:
  - adr_tasks
  - adr_prompt_3
dataset_path: masakhane/diacritics-restoration
dataset_kwargs: {trust_remote_code: True}
doc_to_target: target
output_type: generate_until
fewshot_split: dev
test_split: test
training_split: train
metric_list:
  - metric: bleu
    aggregation: bleu
    higher_is_better: true
  - metric: chrf
    aggregation: chrf
    higher_is_better: true
generation_kwargs:
  do_sample: false
  until:
    - '<eos>'
    - </s>
    - <|im_end|>
metadata:
  version: 1.0
# Generated by utils.py
dataset_name: yor
doc_to_text: 'This text is in Yoruba. Restore all diacritical marks to their proper
  places in the following sentence: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_yor_prompt_3
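Each per-language file above carries only the fields that differ (`dataset_name`, `doc_to_text`, `task`) and pulls everything else from `afridiacritics_yaml` through `include`. The following is a minimal sketch of how such an overlay could be resolved, assuming that keys in the including file simply override keys from the included one; the harness's real loader is not shown in this commit and may behave differently. `resolve_include`, the in-memory `FILES` table, and the file name `afridiacritics_yor.yaml` are hypothetical names used only for illustration.

```python
from typing import Any, Dict

# Hypothetical in-memory stand-in for the YAML files on disk.
# The file name "afridiacritics_yor.yaml" is an assumption; the field
# values are copied from the configs in this commit.
FILES: Dict[str, Dict[str, Any]] = {
    "afridiacritics_yaml": {
        "tag": ["adr_tasks", "adr_prompt_3"],
        "dataset_path": "masakhane/diacritics-restoration",
        "output_type": "generate_until",
        "test_split": "test",
    },
    "afridiacritics_yor.yaml": {
        "dataset_name": "yor",
        "include": "afridiacritics_yaml",
        "task": "afridiacritics_yor_prompt_3",
    },
}


def resolve_include(name: str, files: Dict[str, Dict[str, Any]] = FILES) -> Dict[str, Any]:
    """Return the config for `name` with its `include` chain merged in.

    Assumption: the including file wins on key conflicts.
    """
    config = dict(files[name])  # copy so the table is not mutated
    base_name = config.pop("include", None)
    if base_name is None:
        return config
    merged = resolve_include(base_name, files)  # resolve the base first
    merged.update(config)  # then overlay the per-language fields
    return merged


resolved = resolve_include("afridiacritics_yor.yaml")
print(resolved["task"], resolved["dataset_path"])
```

Keeping the shared fields in one base file means a change to the metrics or stop sequences only has to be made once per prompt variant.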
# Generated by utils.py
dataset_name: bbj
doc_to_text: 'You are a linguist specializing in diacritical marks for Gbomala. Add
  the appropriate diacritics to this Gbomala sentence: {{text}}. Return output sentence
  only'
include: afridiacritics_yaml
task: afridiacritics_bbj_prompt_4
# Generated by utils.py
dataset_name: fon
doc_to_text: 'You are a linguist specializing in diacritical marks for Fon. Add the
  appropriate diacritics to this Fon sentence: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_fon_prompt_4
# Generated by utils.py
dataset_name: ibo
doc_to_text: 'You are a linguist specializing in diacritical marks for Igbo. Add the
  appropriate diacritics to this Igbo sentence: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_ibo_prompt_4
# Generated by utils.py
dataset_name: wol
doc_to_text: 'You are a linguist specializing in diacritical marks for Wolof. Add
  the appropriate diacritics to this Wolof sentence: {{text}}. Return output sentence
  only'
include: afridiacritics_yaml
task: afridiacritics_wol_prompt_4
tag:
  - adr_tasks
  - adr_prompt_4
dataset_path: masakhane/diacritics-restoration
dataset_kwargs: {trust_remote_code: True}
doc_to_target: target
output_type: generate_until
fewshot_split: dev
test_split: test
training_split: train
metric_list:
  - metric: bleu
    aggregation: bleu
    higher_is_better: true
  - metric: chrf
    aggregation: chrf
    higher_is_better: true
generation_kwargs:
  do_sample: false
  until:
    - '<eos>'
    - </s>
    - <|im_end|>
metadata:
  version: 1.0
# Generated by utils.py
dataset_name: yor
doc_to_text: 'You are a linguist specializing in diacritical marks for Yoruba. Add
  the appropriate diacritics to this Yoruba sentence: {{text}}. Return output sentence
  only'
include: afridiacritics_yaml
task: afridiacritics_yor_prompt_4
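The per-language files are marked `# Generated by utils.py`, but the generator itself is not part of this excerpt. As a rough sketch of how files like these could be produced from one prompt template and a language table: `build_config`, `LANGUAGES`, and `PROMPTS` below are hypothetical names, not the contents of the real `utils.py`. The language names and the prompt wording are copied from the configs above.

```python
from string import Template
from typing import Dict

# Language names exactly as they appear in the prompts of this commit.
LANGUAGES: Dict[str, str] = {
    "bbj": "Gbomala",
    "fon": "Fon",
    "ibo": "Igbo",
    "wol": "Wolof",
    "yor": "Yoruba",
}

# Prompt 4 wording. $lang is filled in at generation time; the {{text}}
# placeholder is left untouched for the harness to render per document.
PROMPTS: Dict[int, Template] = {
    4: Template(
        "You are a linguist specializing in diacritical marks for $lang. "
        "Add the appropriate diacritics to this $lang sentence: {{text}}. "
        "Return output sentence only"
    ),
}


def build_config(lang_code: str, prompt_id: int) -> Dict[str, str]:
    """Build the per-language fields for one (language, prompt) pair."""
    return {
        "dataset_name": lang_code,
        "doc_to_text": PROMPTS[prompt_id].substitute(lang=LANGUAGES[lang_code]),
        "include": "afridiacritics_yaml",
        "task": f"afridiacritics_{lang_code}_prompt_{prompt_id}",
    }


for code in LANGUAGES:
    print(build_config(code, 4)["task"])
```

`string.Template` is used here rather than `str.format` so that the literal `{{text}}` braces do not need escaping.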
# Generated by utils.py
dataset_name: bbj
doc_to_text: 'You are a linguist specializing in diacritical marks for Gbomala. Diacritics
  are essential for proper pronunciation and meaning in Gbomala. You are tasked with
  converting Gbomala sentences without diacritics into their correctly accented forms.
  Here''s the input: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_bbj_prompt_5
# Generated by utils.py
dataset_name: fon
doc_to_text: 'You are a linguist specializing in diacritical marks for Fon. Diacritics
  are essential for proper pronunciation and meaning in Fon. You are tasked with converting
  Fon sentences without diacritics into their correctly accented forms. Here''s the
  input: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_fon_prompt_5
# Generated by utils.py
dataset_name: ibo
doc_to_text: 'You are a linguist specializing in diacritical marks for Igbo. Diacritics
  are essential for proper pronunciation and meaning in Igbo. You are tasked with
  converting Igbo sentences without diacritics into their correctly accented forms.
  Here''s the input: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_ibo_prompt_5
# Generated by utils.py
dataset_name: wol
doc_to_text: 'You are a linguist specializing in diacritical marks for Wolof. Diacritics
  are essential for proper pronunciation and meaning in Wolof. You are tasked with
  converting Wolof sentences without diacritics into their correctly accented forms.
  Here''s the input: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_wol_prompt_5
tag:
  - adr_tasks
  - adr_prompt_5
dataset_path: masakhane/diacritics-restoration
dataset_kwargs: {trust_remote_code: True}
doc_to_target: target
output_type: generate_until
fewshot_split: dev
test_split: test
training_split: train
metric_list:
  - metric: bleu
    aggregation: bleu
    higher_is_better: true
  - metric: chrf
    aggregation: chrf
    higher_is_better: true
generation_kwargs:
  do_sample: false
  until:
    - '<eos>'
    - </s>
    - <|im_end|>
metadata:
  version: 1.0
# Generated by utils.py
dataset_name: yor
doc_to_text: 'You are a linguist specializing in diacritical marks for Yoruba. Diacritics
  are essential for proper pronunciation and meaning in Yoruba. You are tasked with
  converting Yoruba sentences without diacritics into their correctly accented forms.
  Here''s the input: {{text}}. Return output sentence only'
include: afridiacritics_yaml
task: afridiacritics_yor_prompt_5
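The shared `generation_kwargs` in `afridiacritics_yaml` disable sampling and list three stop sequences under `until`. How the harness applies them internally is not shown in this commit; the general idea of cutting a generation at the earliest stop sequence can be sketched as follows. `truncate_at_stop` is a hypothetical helper written for illustration, not a harness function.

```python
from typing import Sequence

# Stop sequences copied from the `until` list in afridiacritics_yaml.
STOP_SEQUENCES = ("<eos>", "</s>", "<|im_end|>")


def truncate_at_stop(text: str, stops: Sequence[str] = STOP_SEQUENCES) -> str:
    """Return `text` cut just before the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)  # keep the earliest match across all stops
    return text[:cut]


print(truncate_at_stop("restored sentence</s> trailing tokens<eos>"))
```

Because the prompts ask the model to "Return output sentence only", anything after the first end-of-sequence marker would otherwise be scored by BLEU and chrF as part of the restored sentence.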
# AfriQA
## Paper
Title: `AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages`
Paper Link: https://arxiv.org/abs/2305.06897
## Abstract
>AfriQA is the first cross-lingual question answering (QA) dataset with a focus on African languages. The dataset includes over 12,000 XOR QA examples across 10 African languages, making it an invaluable resource for developing more equitable QA technology. African languages have historically been underserved in the digital landscape, with far less in-language content available online. This makes it difficult for QA systems to provide accurate information to users in their native language. However, cross-lingual open-retrieval question answering (XOR QA) systems can help fill this gap by retrieving answer content from other languages. AfriQA focuses specifically on African languages where cross-lingual answer content is the only high-coverage source of information. Previous datasets have primarily focused on languages where cross-lingual QA augments coverage from the target language, but AfriQA highlights the importance of African languages as a realistic use case for XOR QA.
HomePage: https://github.com/masakhane-io/afriqa
### Citation
```
@misc{ogundepo2023afriqa,
title={AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages},
author={Odunayo Ogundepo and Tajuddeen R. Gwadabe and Clara E. Rivera and Jonathan H. Clark and Sebastian Ruder and David Ifeoluwa Adelani and Bonaventure F. P. Dossou and Abdou Aziz DIOP and Claytone Sikasote and Gilles Hacheme and Happy Buzaaba and Ignatius Ezeani and Rooweither Mabuya and Salomey Osei and Chris Emezue and Albert Njoroge Kahira and Shamsuddeen H. Muhammad and Akintunde Oladipo and Abraham Toluwase Owodunni and Atnafu Lambebo Tonja and Iyanuoluwa Shode and Akari Asai and Tunde Oluwaseyi Ajayi and Clemencia Siro and Steven Arthur and Mofetoluwa Adeyemi and Orevaoghene Ahia and Aremu Anuoluwapo and Oyinkansola Awosan and Chiamaka Chukwuneke and Bernard Opoku and Awokoya Ayodele and Verrah Otiende and Christine Mwase and Boyd Sinkala and Andre Niyongabo Rubungo and Daniel A. Ajisafe and Emeka Felix Onwuegbuzia and Habib Mbow and Emile Niyomutabazi and Eunice Mukonde and Falalu Ibrahim Lawan and Ibrahim Said Ahmad and Jesujoba O. Alabi and Martin Namukombo and Mbonu Chinedu and Mofya Phiri and Neo Putini and Ndumiso Mngoma and Priscilla A. Amuok and Ruqayya Nasir Iro and Sonia Adhiambo},
year={2023},
eprint={2305.06897},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
group: afriqa
task:
  - afriqa_prompt_1
  - afriqa_prompt_2
  - afriqa_prompt_3
  - afriqa_prompt_4
  - afriqa_prompt_5
aggregate_metric_list:
  - metric: acc
    aggregation: mean
    weight_by_size: true
metadata:
  version: 1
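The group config above asks for a `mean` aggregation with `weight_by_size: true`. A minimal sketch of what a size-weighted mean does, compared with a plain mean, is given below. `aggregate_mean` is a hypothetical helper, and the scores and sizes in the usage line are made-up illustration values, not results from this benchmark.

```python
from typing import Sequence


def aggregate_mean(
    scores: Sequence[float],
    sizes: Sequence[int],
    weight_by_size: bool = True,
) -> float:
    """Combine per-subtask scores into one group score.

    With weight_by_size=True each subtask counts in proportion to its
    number of examples; otherwise every subtask counts equally.
    """
    if not scores or len(scores) != len(sizes):
        raise ValueError("scores and sizes must be non-empty and of equal length")
    if not weight_by_size:
        return sum(scores) / len(scores)
    total = sum(sizes)
    if total == 0:
        raise ValueError("total size must be positive")
    return sum(score * size for score, size in zip(scores, sizes)) / total


# Illustration values only: a large subtask at 0.5 and a small one at 1.0.
print(aggregate_mean([0.5, 1.0], [300, 100], weight_by_size=True))
print(aggregate_mean([0.5, 1.0], [300, 100], weight_by_size=False))
```

Weighting by size keeps a small subtask from moving the group score as much as a large one, which matters when the prompt variants or languages have uneven test-set sizes.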