- 11 Aug, 2025 13 commits
- 08 Aug, 2025 1 commit
  - Avelina Asada Hadji-Kyriacou authored
    * Update afridiacritics_yaml
    * Update afrisenti
    * Update nollysenti
    * Update ntrex
    * Update salt
- 04 Aug, 2025 5 commits
  - Baber Abbasi authored
  - parkhs21 authored
    * improve include-path precedence handling
    * test: add task for test
    * add test for include path precedence handling
    * Refactor `test_include_path.py`
    Co-authored-by: Baber <baber@hey.com>
  - Matthias Neumayer authored
    Tasks are called with just the task name, without the .yaml extension.
  - Idan Tene authored
    * Update humaneval_64_instruct.yaml: sync doc_to_text with humaneval_instruct.yaml
    * Update humaneval_instruct.yaml: remove redundant (flawed) spaces
    * Update README.md
    * Bump task version
  - Felix Michalak authored
    * Update continuation group names to fit the README
    * added changelog to README and switched datasets from hails to cais
    * added missing newline at end of README
- 02 Aug, 2025 1 commit
  - Cyrus Leung authored
    * Update vLLM compatibility
    * add TokensPrompt to all generate calls
    Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
    Co-authored-by: Baber <baber@hey.com>
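    A minimal sketch of what passing pre-tokenized input via `TokensPrompt` looks like in recent vLLM versions; the model name and token ids below are placeholders, not taken from this commit:

    ```python
    # Hedged sketch: wrap already-encoded token ids in TokensPrompt so that
    # generate() accepts them directly instead of a raw prompt string.
    from vllm import LLM, SamplingParams, TokensPrompt

    llm = LLM(model="facebook/opt-125m")  # placeholder model
    params = SamplingParams(temperature=0.0, max_tokens=32)

    prompt = TokensPrompt(prompt_token_ids=[1, 3087, 15, 42])  # placeholder ids
    outputs = llm.generate(prompt, params)
    print(outputs[0].outputs[0].text)
    ```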
- 24 Jul, 2025 2 commits
  - Baber Abbasi authored
  - weiliang authored
- 23 Jul, 2025 4 commits
  - Baber Abbasi authored
    * remove trust-remote-code
    * add W605 rule
  - Michael Goin authored
    `device` has been a deprecated argument for several vLLM releases and is now removed in 0.10.0: https://github.com/vllm-project/vllm/pull/21349
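    As a hedged illustration (not code from this commit), the fix amounts to no longer forwarding a `device` keyword when constructing the engine on vLLM >= 0.10.0:

    ```python
    # Hedged sketch: vLLM >= 0.10.0 removed the long-deprecated `device` argument,
    # so the engine is built without it and device placement is inferred automatically.
    from vllm import LLM

    # Older vLLM accepted LLM(model=..., device="cuda"); passing `device` now fails.
    llm = LLM(model="facebook/opt-125m")  # placeholder model; note: no `device` kwarg
    ```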
  - Baber Abbasi authored
    * Fix: pin datasets < 4.0
    * fix
    * update type hints in HF
    * fix hellaswag path
  - Avelina Asada Hadji-Kyriacou authored
    * added support for additional chat template arguments
    * use `enable_thinking`
    * add wrap logging function
    * add `chat_template_args` back to HF
    Co-authored-by: Baber <baber@hey.com>
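    A rough sketch of the Hugging Face mechanism such extra chat-template arguments feed into, assuming a tokenizer whose template understands `enable_thinking` (e.g. Qwen3); the harness-side plumbing is not shown and the model name is a placeholder:

    ```python
    # Hedged sketch: extra kwargs passed to apply_chat_template are exposed to the
    # chat template, so templates like Qwen3's can toggle thinking mode via enable_thinking.
    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")  # placeholder model
    messages = [{"role": "user", "content": "What is 2 + 2?"}]
    prompt = tok.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,  # template-specific; ignored by templates that don't use it
    )
    print(prompt)
    ```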
- 22 Jul, 2025 2 commits
  - Svetlana Karimova authored
    * Feat: add LIBRA benchmark
    * Feat: add dataset filter to LIBRA
    * Fix: formatting through pre-commit and main tasks README
    * Fix: resolve conflict
    * Fix: use the real dataset name
    * Fix: delete unnecessary datasets and correct dependency
    Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
  - Geun, Lim authored
    * Fix: extended max_gen_toks to 8192 for HRM8K math benchmarks
    * Increased max_gen_toks to 2048 (matches Appendix B of the original paper)
    * Added Evaluation Settings and Changelog sections
    * add some logs
    Co-authored-by: Baber <baber@hey.com>
- 19 Jul, 2025 4 commits
  - Avelina Asada Hadji-Kyriacou authored
    * Added missing fixture in test_unitxt_tasks.py
    * pacify pre-commit
    Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
  - Baber Abbasi authored
  - James A. Michaelov authored
    * add multiblimp
    * run linter
  - Avelina Asada Hadji-Kyriacou authored
    * Update default.yaml
- 18 Jul, 2025 3 commits
  - Ramiro R. C. authored
    * added headers and custom model name | fixed bug with trust_remote_code param
    * linting
    * removed custom model name | changed headers override
    * add `header` to base TemplateAPI
    * nit
    Co-authored-by: Baber <baber@hey.com>
  - mans authored
    * fix request hanging when calling the API
    * pre-commit
    Co-authored-by: qinyidao <qinyidao@moonshot.cn>
  - Idan Tene authored
    * Update utils.py
- 16 Jul, 2025 2 commits
  - philipdoldo authored
    * Removed the "Let's think step by step." text from the start of the target entry in each sample, so the phrase is not repeated twice in the few-shot prompts and to match the behavior of the original BBH repository. This applied to 26 of the 27 subtasks; the only one unaffected is boolean_expressions.yaml.
    * Note on boolean_expressions.yaml: the prompt arguably has an error in that the "Remember that (i) ..." text does not follow the final "A: Let's think step by step.". Models like EleutherAI/gpt-neo-125m tend to begin their answers with this string anyway (copying the few-shot prompts), and it arguably should have been part of the prompt, much like "A: Let's think step by step." is included in the prompt for all of the CoT tasks. However, the original BBH repo has the same issue, so it is kept this way for consistency.
    * feat: remove extra space from answers; add changelog
    Co-authored-by: Baber <baber@hey.com>
  - Baber Abbasi authored
    * feat: add postprocessing for generated text to strip stop sequences and thinking tokens
    * nit
    * fix: trim leading whitespace after stripping thinking tokens from generation
    * feat: add think_end_token to model_args
    * nit
    * add to readme
    * nit
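    As a hedged sketch (not the harness's actual implementation), the described postprocessing boils down to dropping everything up to a configurable think-end token and then trimming the leading whitespace that remains:

    ```python
    # Hedged sketch: remove the reasoning prefix up to and including the
    # think-end token, then strip the leading whitespace left behind.
    def strip_thinking(text: str, think_end_token: str = "</think>") -> str:
        _, sep, rest = text.partition(think_end_token)
        return rest.lstrip() if sep else text


    print(strip_thinking("<think>reasoning...</think>\n\nThe answer is 4."))
    # -> "The answer is 4."
    ```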
- 15 Jul, 2025 1 commit
  - MaYongQing authored
- 14 Jul, 2025 2 commits
  - Ankit Gola authored
  - Avelina Asada Hadji-Kyriacou authored