1. 10 Sep, 2024 1 commit
    • Add Open Arabic LLM Leaderboard Benchmarks (Full and Light Version) (#2232) · decc533d
      Malikeh Ehghaghi authored
      
      
      * arabic leaderboard yaml file is added
      
      * arabic toxigen is implemented
      
      * Dataset library is imported
      
      * arabic sciq is added
      
      * util file of arabic toxigen is updated
      
      * arabic race is added
      
      * arabic piqa is implemented
      
      * arabic open qa is added
      
      * arabic copa is implemented
      
      * arabic boolq is added
      
      * arabic arc easy is added
      
      * arabic arc challenge is added
      
      * arabic exams benchmark is implemented
      
      * arabic hellaswag is added
      
      * arabic leaderboard yaml file metrics are updated
      
      * arabic mmlu benchmarks are added
      
      * arabic mmlu group yaml file is updated
      
      * alghafa benchmarks are added
      
      * acva benchmarks are added
      
      * acva utils.py is updated
      
      * light version of arabic leaderboard benchmarks is added
      
      * bugs fixed
      
      * bug fixed
      
      * bug fixed
      
      * bug fixed
      
      * bug fixed
      
      * bug fixed
      
      * library import bug is fixed
      
      * doc to target updated
      
      * bash file is deleted
      
      * results folder is deleted
      
      * leaderboard groups are added
      
      * full arabic leaderboard groups are added, plus some bug fixes to the light version
      
      * Create README.md
      
      README.md for arabic_leaderboard_complete
      
      * Create README.md
      
      README.md for arabic_leaderboard_light
      
      * Delete lm_eval/tasks/arabic_leaderboard directory
      
      * Update README.md
      
      * Update README.md
      
      adding the Arabic leaderboards to the library
      
      * Update README.md
      
      10% of the training set
      
      * Update README.md
      
      10% of the training set
      
      * revert .gitignore to prev version
      
      * Update lm_eval/tasks/README.md
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * updated main README.md
      
      * Update lm_eval/tasks/README.md
      
      * specify machine translated benchmarks (complete)
      
      * specify machine translated benchmarks (light version)
      
      * add alghafa to the related task names (complete and light)
      
      * add 'acva' to the related task names (complete and light)
      
      * add 'arabic_leaderboard' to all the groups (complete and light)
      
      * all dataset - not a random sample
      
      * added more accurate details to the readme file
      
      * added mt_mmlu from okapi
      
      * Update lm_eval/tasks/README.md
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * Update lm_eval/tasks/README.md
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * updated mt_mmlu readme
      
      * renaming 'alghafa' full and light
      
      * renaming 'arabic_mmlu' light and full
      
      * renaming 'acva' full and light
      
      * update readme and standardize dir/file names
      
      * running pre-commit
      
      ---------
      Co-authored-by: shahrzads <sayehban@ualberta.ca>
      Co-authored-by: shahrzads <56282669+shahrzads@users.noreply.github.com>
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>