- 21 Sep, 2025 1 commit
Luis Cosio authored
* Added benchmark
* Added more testing
* Added task definition for mmlu_redux and mmlu_redux_spanish
* Add MMLU Redux English and Spanish tasks with YAML fixes and READMEs
* Add remaining MMLU Redux YAMLs and updated tasks README
* Add MMLU Redux English and Spanish tasks with YAML fixes and READMEs
* Add MMLU Redux changes from pr-2705
* Resolve pre-commit hook and pytest overlapping group issues by adding mmlu_redux_spanish task entries and unique subgroup names
* Enhance retry logic to prevent 429 error when using Hugging Face API for tests, apply pre-commit fixes
* Revert python test changes and comments one task group to avoid Hugging Face rate limit and task failure
---------
Co-authored-by: CT-6282 <ricardo.godric@hotmail.com>
-
- 23 Jul, 2025 1 commit
Baber Abbasi authored
* remove trust-remote-code
* add W605 rule
-
- 16 Apr, 2025 1 commit
Baber Abbasi authored
* switch MMLU to cais/mmlu
* switch back to tj-actions/changed-files
* cache HF folder
-
- 09 Aug, 2024 1 commit
Jungwhan Kim authored
* add keep trailing newline
* apply ruff-format
* add prompt unit test
* increment the version of tasks that have description with whitespace
* remove white spaces of leaderboard bbh
* update MMLU expected versions in output
* CI run does display the expected version=1 for mmlu subtasks, fix expected test output again
---------
Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
-
- 26 Jun, 2024 1 commit
Hailey Schoelkopf authored
* make MMLU trust remote code to fix tests
* remove trust remote code
-
- 21 Dec, 2023 1 commit
Hailey Schoelkopf authored
* change version field formatting in metadata
* mention versioning in new task guide
* add instructions for changelog
* run linters
-
- 28 Nov, 2023 2 commits
lintangsutawika authored
-
lintangsutawika authored
-
- 10 Nov, 2023 1 commit
lintangsutawika authored
-
- 01 Nov, 2023 1 commit
haileyschoelkopf authored
-
- 16 Oct, 2023 2 commits
lintangsutawika authored
-
lintangsutawika authored
-
- 06 Oct, 2023 2 commits
lintangsutawika authored
-
lintangsutawika authored
-
- 26 Sep, 2023 1 commit
Hailey Schoelkopf authored
-
- 21 Sep, 2023 1 commit
lintangsutawika authored
-
- 04 Sep, 2023 4 commits
lintangsutawika authored
-
lintangsutawika authored
-
lintangsutawika authored
-
lintangsutawika authored
-
- 03 Sep, 2023 1 commit
lintangsutawika authored
-