- 25 Sep, 2025 31 commits
-
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
- 22 Sep, 2025 1 commit
-
-
priverabsc authored
* Add eqbench tasks in Spanish and Catalan * Incremented catalan_bench and spanish_bench versions. Added 'multilingual' folder inside 'eq_bench' and moved the eqbench_ca and eqbench_es .yaml to that folder. Updated the tasks README with eqbench_es and eqbench_ca, expliciting inside each description both the Hugging Face link and the translation method. * Fixed tasks table. * remove test_task.sh and results folder * Add utils.py to multilingual folder
-
- 21 Sep, 2025 6 commits
-
-
its-alpesh authored
* Add humaneval_infilling task * pacify pre-commit --------- Co-authored-by:Baber Abbasi <92168766+baberabb@users.noreply.github.com>
-
Janna authored
* register aime * lint --------- Co-authored-by:Baber <baber@hey.com>
-
Janna authored
* create babilong tasks * lint * add clarification * fix typo * add babilong description
-
Luis Cosio authored
* Added benchmark * Added more testing * Added task definition for mmlu_redux and mmlu_redux_spanish * Add MMLU Redux English and Spanish tasks with YAML fixes and READMEs * Add remaining MMLU Redux YAMLs and updated tasks README * Add MMLU Redux English and Spanish tasks with YAML fixes and READMEs * Add MMLU Redux changes from pr-2705 * Resolve pre-commit hook and pytest overlapping group issues by adding mmlu_redux_spanish task entries and unique subgroup names * Enhance retry logic to prevent 429 error when using Hugging Face API for tests, apply pre-commit fixes * Revert python test changes and comments one task group to avoid Hugging Face rate limit and task failure --------- Co-authored-by:CT-6282 <ricardo.godric@hotmail.com>
-
kaixuanliu authored
Signed-off-by:Liu, Kaixuan <kaixuan.liu@intel.com>
-
Timur Aysin authored
* fix: set 'do_sample=False' and use double quotes in 'doc_to_text' * feat: update versions and README for longbench * pacify pre-commit --------- Co-authored-by:Baber <baber@hey.com>
-
- 12 Sep, 2025 1 commit
-
-
fxmarty-amd authored
-
- 08 Sep, 2025 1 commit
-
-
Slim Frikha authored
* feat(vllm_causallms): make collator ignore seed when splitting batch into chunks * fix(collator): revert PR changes * fix(vllm-causallm): update collator call with groupby None * feat(sglang-causallms): make generation accept a list of sampling params --------- Co-authored-by:Baber <baber@hey.com>
-