- 11 Oct, 2025 3 commits
- 16 Sep, 2025 3 commits
- 12 Sep, 2025 1 commit
-
-
fxmarty-amd authored
-
- 08 Sep, 2025 3 commits
-
-
Slim Frikha authored
* feat(vllm_causallms): make collator ignore seed when splitting batch into chunks
* fix(collator): revert PR changes
* fix(vllm-causallm): update collator call with groupby None
* feat(sglang-causallms): make generation accept a list of sampling params
---------
Co-authored-by: Baber <baber@hey.com>
-
James A. Michaelov authored
* add icelandic_winogrande
* fix spacing for final words in sentence
-
Lucia Quirke authored
-
- 02 Sep, 2025 4 commits
-
-
Valle Ruiz-Fernández authored
* Add EsBBQ and CaBBQ tasks
* Linter fixes
* add esbbq and cabbq to task list
---------
Co-authored-by: Júlia Falcão <juliafsfalcao@hotmail.com>
-
James A. Michaelov authored
-
James A. Michaelov authored
-
James A. Michaelov authored
* run linter
* add acc_norm
-
- 27 Aug, 2025 3 commits
-
-
Gül Sena A authored
* Fix codex-glue/code2text group issue
* Added README
* pacify pre-commit
---------
Co-authored-by: Baber <baber@hey.com>
-
Baber Abbasi authored
-
Slim Frikha authored
-
- 26 Aug, 2025 1 commit
-
-
Janna authored
* add AIME tasks
* standardize the repeats
* fix task naming
* aime25 only has test set
* edit readme
* add utils
* standardize
* fix case sensitivity
* repeat once
* lint
* more linting
* lint huggingface.py
-
- 25 Aug, 2025 4 commits
-
-
Weihao XUAN authored
* update MMLU_ProX
* update MMLU_ProX
* cleanup code by pre-commit
-
Nikita Savelyev authored
* Add support for OVModelForSeq2SeqLM
* Add test
-
William Held authored
* Anthropic Discrim Eval
* Mixed Effects Regression
* Actually wire it all up
* Operator Name Doesn't Exist on Github
* Update lm_eval/tasks/discrim_eval/discrim_eval_implicit.yaml Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
* Update discrim_eval_implicit.yaml
* Update discrim_eval_explicit.yaml
* pacify pre-commit
---------
Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
Co-authored-by: Baber <baber@hey.com>
-
Geun, Lim authored
* feat: Add CLIcK task
* Fix formatting issues
* Add Click Task Description
* fix: lint
* fix
-
- 23 Aug, 2025 1 commit
-
-
Baber Abbasi authored
* update math_verify
* remove normalization
* use full solution in `parse`
* update version
-
- 22 Aug, 2025 1 commit
-
-
Patrick Haller authored
Co-authored-by: Patrick Haller <phmaker@Patricks-MacBook-Pro.local>
-
- 21 Aug, 2025 9 commits
-
-
James A. Michaelov authored
* add lm_syneval
* edit readme
* update task readme
* formatting fixes
* run linting
* add descriptions and examples
* clean readme formatting
-
James A. Michaelov authored
* add turblimp
* update general task readme
* add normalized accuracy
-
James A. Michaelov authored
* add blimp_nl
* add template yaml file
-
James A. Michaelov authored
* add zhoblimp files
* correct group name
* fix group
* add normalized accuracy
-
FranValero97 authored
-
Kurt Yang authored
Adding support for OpenAI GPT-5 model; models only support hardcoded temperature=1 and stop=None (#3247)
-
Anri Lombard authored
-
Jafar Isbarov authored
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 13 Aug, 2025 1 commit
-
-
Xinhe Shi authored
-
- 08 Aug, 2025 1 commit
-
-
Avelina Asada Hadji-Kyriacou authored
* Update afridiacritics_yaml
* Update afrisenti
* Update nollysenti
* Update ntrex
* Update salt
-
- 04 Aug, 2025 5 commits
-
-
Baber Abbasi authored
-
parkhs21 authored
* improve include-path precedence handling
* test: add task for test
* add test for include path precedence handling
* Refactor `test_include_path.py`
---------
Co-authored-by: Baber <baber@hey.com>
-
Matthias Neumayer authored
The tasks are called without .yaml, using just the task name
-
Idan Tene authored
* Update humaneval_64_instruct.yaml: Sync doc_to_text with humaneval_instruct.yaml
* Update humaneval_instruct.yaml: Remove redundant (flawed) spaces
* Update README.md
* Bump task version
-
Felix Michalak authored
* Update continuation group names to fit Readme
* added changelog to readme and switched datasets from hails to cais
* added missing new line at end of readme
-