- 25 Sep, 2025 34 commits
-
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
- 22 Sep, 2025 1 commit
-
-
priverabsc authored
* Add eqbench tasks in Spanish and Catalan
* Incremented catalan_bench and spanish_bench versions. Added 'multilingual' folder inside 'eq_bench' and moved the eqbench_ca and eqbench_es .yaml files to that folder. Updated the tasks README with eqbench_es and eqbench_ca, explicitly stating in each description both the Hugging Face link and the translation method.
* Fixed tasks table.
* remove test_task.sh and results folder
* Add utils.py to multilingual folder
-
- 21 Sep, 2025 5 commits
-
-
its-alpesh authored
* Add humaneval_infilling task
* pacify pre-commit
---------
Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
-
Janna authored
* register aime
* lint
---------
Co-authored-by: Baber <baber@hey.com>
-
Janna authored
* create babilong tasks
* lint
* add clarification
* fix typo
* add babilong description
-
Luis Cosio authored
* Added benchmark
* Added more testing
* Added task definition for mmlu_redux and mmlu_redux_spanish
* Add MMLU Redux English and Spanish tasks with YAML fixes and READMEs
* Add remaining MMLU Redux YAMLs and updated tasks README
* Add MMLU Redux English and Spanish tasks with YAML fixes and READMEs
* Add MMLU Redux changes from pr-2705
* Resolve pre-commit hook and pytest overlapping group issues by adding mmlu_redux_spanish task entries and unique subgroup names
* Enhance retry logic to prevent 429 error when using Hugging Face API for tests, apply pre-commit fixes
* Revert python test changes and comments one task group to avoid Hugging Face rate limit and task failure
---------
Co-authored-by: CT-6282 <ricardo.godric@hotmail.com>
-
kaixuanliu authored
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
-