- 24 Sep, 2025 1 commit
-
-
Baber authored
-
- 25 Jul, 2025 1 commit
-
-
Baber authored
-
- 24 Jul, 2025 1 commit
-
-
Baber authored
-
- 23 Jul, 2025 3 commits
-
-
Baber Abbasi authored
* Fix: pin datasets < 4.0 * fix * update type hints in HF * fix hellaswag path
-
Baber authored
-
Baber authored
-
- 22 Jul, 2025 4 commits
- 21 Jul, 2025 4 commits
- 19 Jul, 2025 1 commit
-
-
Baber authored
-
- 18 Jul, 2025 1 commit
-
-
Baber authored
-
- 10 Jul, 2025 2 commits
- 08 Jul, 2025 2 commits
- 07 Jul, 2025 1 commit
-
-
Baber authored
-
- 04 Jul, 2025 2 commits
- 03 Jul, 2025 1 commit
-
-
Baber authored
-
- 01 Jul, 2025 1 commit
-
-
Baber authored
-
- 30 Jun, 2025 6 commits
- 25 Jun, 2025 1 commit
-
-
Kiersten Stokes authored
Signed-off-by: kiersten-stokes <kierstenstokes@gmail.com>
-
- 03 Jun, 2025 1 commit
-
-
Baber Abbasi authored
* fix: bug in acc_mutual_info slicing; add `target_delimiter` to uncond choices * add tests
-
- 21 May, 2025 1 commit
-
-
Baber Abbasi authored
This reverts commit 4dbd5ec9
-
- 19 May, 2025 1 commit
-
-
Baber Abbasi authored
* add `sglang-generate` * nit * nit * nit * pacify pre-commit
-
- 15 May, 2025 1 commit
-
-
Tingchen Fu authored
-
- 16 Apr, 2025 1 commit
-
-
Baber Abbasi authored
* add warning for default `until` * fix stop tokens; add vcsum * bugfix: fix doc_to_target to string * fix lsht, trec * add task to readme * add debugging logs for multiple input/output
-
- 07 Apr, 2025 1 commit
-
-
Felipe Maia Polo authored
Add `--samples` Argument for Fine-Grained Task Evaluation in `lm-evaluation-harness`. This feature is the first step towards efficient multi-prompt evaluation with PromptEval [1,2] (#2520)
* added option --examples
* specifying examples in dictionary
* run pre-commit - fix arg type
* fixing bug when examples==None
* fixing bug when examples==None
* limit or examples must be None in simple_evaluate.py and in evaluator.py
* run pre-commit (fix formatting)
* merge main and run pre-commit (fix formatting)
* Update __main__.py undefined "limit" and "examples"
* update branch, fix conflicts, run pre-commit
* nits
* nits
* change 'examples' to 'samples'
---------
Signed-off-by: Mírian Silva <mirianfrsilva@ibm.com>
Co-authored-by: mirianfrsilva <mirianfrsilva@ibm.com>
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
Co-authored-by: Baber <baber@hey.com>
-
- 18 Mar, 2025 1 commit
-
-
Baber Abbasi authored
support for long-context (and other synthetic tasks) * add ruler * add longbench * pass `metadata` to TaskConfig
-
- 14 Mar, 2025 1 commit
-
-
achervyakov authored
* Added audio-modality pipeline for qwen2-audio model * Beauty imports * fix apply_chat_template args * update default audio placeholders list * add demo task - common_voice subset * add audiolm_qwen libs to pyproject.toml * pre-commit beautify --------- Co-authored-by: Alexandra Rak <rakalexandra@mail.ru>
-