- 18 Nov, 2024 1 commit
Kozzy Voudouris authored
* Add metabench (Kipnis et al. 2024)
* Update metabench tasks for full replication of original benchmarks, using publicly available datasets
* Remove unnecessary import
* Add permute versions of each task, where the answer orders are randomly shuffled
* Add metabench group for easier evaluations
* Fix mmlu counts after removing duplicate
* Add secondary datasets
* Fix f-string error
* Fix f-string error for permute processing
* Add original hash to outputs for easy matching to original results
* Add line break at end of utils files
* Remove extra line from winogrande
* Reformat for linters
* Fix multiple input test
* Appease pre-commit
* Add metabench to tasks README
* Fix multiple input `test_doc_to_text`

Co-authored-by: Baber <baber@hey.com>
-
- 13 May, 2024 1 commit
Lucas Weber authored
* Add tinyBenchmarks
* Add acknowledgements
* Add ordering of outputs for data-parallel
* Run pre-commit
* Add few_shot specifications
* Add tinyBenchmarks post-processing
* Add conditional import; fix task names

Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
-
- 21 Dec, 2023 1 commit
Hailey Schoelkopf authored
* Change version field formatting in metadata
* Mention versioning in new task guide
* Add instructions for changelog
* Run linters
-
- 28 Nov, 2023 2 commits
lintangsutawika authored
-
lintangsutawika authored
-
- 15 Aug, 2023 1 commit
lintangsutawika authored
-
- 14 Aug, 2023 1 commit
lintangsutawika authored
-
- 18 Jul, 2023 1 commit
haileyschoelkopf authored
-
- 08 Jul, 2023 1 commit
nopperl authored
-