- 24 Jul, 2025 3 commits
-
-
Baber Abbasi authored
-
weiliang authored
-
Baber authored
-
- 23 Jul, 2025 10 commits
-
-
Baber Abbasi authored
* remove trust-remote-code
* add W605 rule
-
Michael Goin authored
`device` has been a deprecated argument for a few vLLM releases and is removed in 0.10.0: https://github.com/vllm-project/vllm/pull/21349
-
Baber Abbasi authored
* Fix: pin datasets < 4.0
* fix
* update type hints in HF
* fix hellaswag path
-
Avelina Asada Hadji-Kyriacou authored
* added support for additional chat template arguments
* use `enable_thinking`
* add wrap logging function
* add `chat_template_args` back to HF
---------
Co-authored-by: Baber <baber@hey.com>
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
- 22 Jul, 2025 7 commits
-
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Svetlana Karimova authored
* Feat: add LIBRA benchmark
* Feat: add dataset filter to LIBRA
* Fix: formatting through pre-commit and main tasks README
* Fix: resolve conflict
* Fix: dataset name to real
* Fix: delete unnecessary datasets and correct dependency
---------
Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
-
Geun, Lim authored
* Fix: extended to max_gen_toks 8192 for HRM8K math benchmarks
* Increased max_gen_toks to 2048 (matches Appendix B of original paper).
* Added Evaluation Settings and Changelog sections.
* add some logs
---------
Co-authored-by: Baber <baber@hey.com>
-
- 21 Jul, 2025 15 commits
-
-
Baber authored
feat: implement `check_gold_index_error` utility and refactor `process_results` for improved error handling; remove `generate_until` multiple-choice
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
Baber authored
-
- 19 Jul, 2025 4 commits
-
-
Baber Abbasi authored
-
James A. Michaelov authored
* add multiblimp
* run linter
-
Avelina Asada Hadji-Kyriacou authored
* Update default.yaml
-
Baber authored
-
- 18 Jul, 2025 1 commit
-
-
Baber authored
-