    Add Open Arabic LLM Leaderboard Benchmarks (Full and Light Version) (#2232) · decc533d
    Malikeh Ehghaghi authored
    
    
    * arabic leaderboard yaml file is added
    
    * arabic toxigen is implemented
    
    * Dataset library is imported
    
    * arabic sciq is added
    
    * util file of arabic toxigen is updated
    
    * arabic race is added
    
    * arabic piqa is implemented
    
    * arabic open qa is added
    
    * arabic copa is implemented
    
    * arabic boolq is added
    
    * arabic arc easy is added
    
    * arabic arc challenge is added
    
    * arabic exams benchmark is implemented
    
    * arabic hellaswag is added
    
    * arabic leaderboard yaml file metrics are updated
    
    * arabic mmlu benchmarks are added
    
    * arabic mmlu group yaml file is updated
    
    * alghafa benchmarks are added
    
    * acva benchmarks are added
    
    * acva utils.py is updated
    
    * light version of arabic leaderboard benchmarks is added
    
    * bugs fixed
    
    * bug fixed
    
    * bug fixed
    
    * bug fixed
    
    * bug fixed
    
    * bug fixed
    
    * library import bug is fixed
    
    * doc to target updated
    
    * bash file is deleted
    
    * results folder is deleted
    
    * leaderboard groups are added
    
    * full arabic leaderboard groups are added, plus some bug fixes to the light version
    
    * Create README.md
    
    README.md for arabic_leaderboard_complete
    
    * Create README.md
    
    README.md for arabic_leaderboard_light
    
    * Delete lm_eval/tasks/arabic_leaderboard directory
    
    * Update README.md
    
    * Update README.md
    
    adding the Arabic leaderboards to the library
    
    * Update README.md
    
    10% of the training set
    
    * Update README.md
    
    10% of the training set
    
    * revert .gitignore to prev version
    
    * Update lm_eval/tasks/README.md
    Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
    
    * updated main README.md
    
    * Update lm_eval/tasks/README.md
    
    * specify machine translated benchmarks (complete)
    
    * specify machine translated benchmarks (light version)
    
    * add alghafa to the related task names (complete and light)
    
    * add 'acva' to the related task names (complete and light)
    
    * add 'arabic_leaderboard' to all the groups (complete and light)
    
    * all dataset - not a random sample
    
    * added more accurate details to the readme file
    
    * added mt_mmlu from okapi
    
    * Update lm_eval/tasks/README.md
    Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
    
    * Update lm_eval/tasks/README.md
    Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
    
    * updated mt_mmlu readme
    
    * renaming 'alghafa' full and light
    
    * renaming 'arabic_mmlu' light and full
    
    * renaming 'acva' full and light
    
    * update readme and standardize dir/file names
    
    * running pre-commit
    
    ---------
    Co-authored-by: shahrzads <sayehban@ualberta.ca>
    Co-authored-by: shahrzads <56282669+shahrzads@users.noreply.github.com>
    Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
README.md 16.3 KB