1. 15 May, 2025 1 commit
    • AfroBench: How Good are Large Language Models on African Languages? (#2825) · 18297993
      Jess authored
      
      
      * add afrixnli to task
      
      * add chat completion
      
      * remove chat completion -untested
      
      * afrimmlu added
      
      * afrimmlu folder update
      
      * afrimmlu folder update
      
      * updated prompt
      
      * remove print
      
      * add afrimgsm -direct
      
      * add squad metric
      
      * fix bash script
      
      * remove direct util, update common yaml
      
      * remove print
      
      * add few shot, metric fixes
      
      * fix direct path, add bash script for gpt models
      
      * added translate test
      
      * update afrixnli tasks
      
      * update afrixnli tasks
      
      * update metrics for afrixnli
      
      * prompt translations fix
      
      * prompt translations fix
      
      * filter and metric fix -mgsm
      
      * remove squad metric
      
      * remove squad metric
      
      * add f1 score to mgsm
      
      * add f1 score to mgsm
      
      * update native-direct with lin
      
      * change f1 function
      
      * add lin to utils
      
      * add utils
      
      * remove test limit
      
      * remove test configs
      
      * add swahili to mmlu
      
      * change eng to ewe in ewe yaml mmlu
      
      * add squad metric to mgsm, remove whitespace filter
      
      * added translate test
      
      * added afrixnli_translate
      
      * fix exact match valueError
      
      * fix exact match valueError
      
      * restructure mmlu folder
      
      * spacing
      
      * remove afrimmlu_translate folder
      
      * add utility
      
      * format task name, clean ups
      
      * modified mgsm
      
      * update on afrimgsm
      
      * update on afrimgsm
      
      * removed utils
      
      * other mgsm varieties
      
      * other mgsm varieties
      
      * adding translate direct
      
      * Update translate_direct_yaml
      
      * add manual xnli prompt, add multichoice for openai models, and adapt multichoice metric for openai model
      
      * edit for open models
      
      * Update translate_direct_yaml
      
      * add verbalizer for xnli
      
      * change xnli from multiple choice to generate
      
      * add manual accuracy scores
      
      * revert xnli to multiple choice
      
      * change afrimgsm utils
      
      * revert xnli to multiple_choice
      
      * cleanups and readmes
      
      * remove openai fixes and unused regex
      
      * pr review changes
      
      * revert metrics.py, task.py and extraction.py to main version
      
      * add afrisenti
      
      * utilities
      
      * pulled from main
      
      * add afrixnli
      
      * add afrimmlu
      
      * update afrixnli prompts
      
      * missing senti language
      
      * fix afrisenti prompt 2
      
      * fix afrisenti prompts
      
      * fix afrisenti prompts
      
      * configure task grouping
      
      * add multiple prompts to afrixnli for irokobench
      
      * add multiple prompts to afrimmlu for irokobench
      
      * Update afrixnli_yaml
      
      * fixes and moves
      
      * fixes and moves
      
      * afrimmlu multiple prompts configs
      
      * remove validation set from afrimmlu
      
      * remove eng from afrimmlu translate test
      
      * correct dataset path
      
      * multiple prompts for mgsm
      
      * file restructure
      
      * afribench grouping
      
      * repo restructuring
      
      * repo restructuring
      
      * update exact match to hugging face exact match and add new mgsm language
      
      * remove decontamination
      
      * update generation kwargs
      
      * update generation kwargs for all mgsm prompts
      
      * remove lang
      
      * update generation kwargs for afrimgsm translatetest
      
      * add afrimgsm cot for direct and translate
      
      * remove eng from translate-cot
      
      * add masakhaPOS tasks
      
      * remove changes from task script
      
      * add masakhanews tasks
      
      * add uhura arc easy
      
      * add afriqa and belebele files
      
      * add tags for easier run. add naija rc
      
      * add new metrics and transformation scripts
      
      * fix afriqa swa fewshot split
      
      * add naijarc
      
      * add afrobench lite tasks
      
      * update afrobench
      
      * update afrobench
      
      * remove unverified files to avoid bugs
      
      * remove files not needed
      
      * add afrobench tasks
      
      * add afrobench tasks
      
      * change to version 1
      
      * change to version 1
      
      * update afrobench
      
      * update afrobench
      
      * restore metric to original script
      
      * update readme instructions
      
      * add individual dataset readmes
      
      * add link to collections
      
      * correct run script
      
      * align with main
      
      * align with main
      
      * align with main
      
      * align with main
      
      * align with main
      
      * align with main
      
      * align with main
      
      * align with main
      
      * failed run fixes
      
      * failed run fixes
      
      * add afrimgsm cot
      
      * Apply precommit fixes
      
      * update mafand dataset name
      
      * pull request fixes
      
      * remove afrihate due to availability
      
      ---------
      Co-authored-by: Israel Abebe Azime <azime@cg.uni-saarland.de>
      Co-authored-by: Israel Abebe Azime <se.israel.abebe@gmail.com>
      Co-authored-by: David Adelani <davlanade@gmail.com>
      Co-authored-by: theyorubayesian <akin.o.oladipo@gmail.com>