1. 21 Sep, 2025 1 commit
    • feat: Add mmlu-redux and its Spanish translation as generative task definitions (#2705) · fec9dde7
      Luis Cosio authored
      
      
      * Added benchmark
      
      * Added more testing
      
      * Added task definition for mmlu_redux and mmlu_redux_spanish
      
      * Add MMLU Redux English and Spanish tasks with YAML fixes and READMEs
      
      * Add remaining MMLU Redux YAMLs and updated tasks README
      
      * Add MMLU Redux changes from pr-2705
      
      * Resolve pre-commit hook and pytest overlapping group issues by adding mmlu_redux_spanish task entries and unique subgroup names
      
      * Enhance retry logic to prevent 429 error when using Hugging Face API for tests, apply pre-commit fixes
      
      * Revert Python test changes and comment out one task group to avoid Hugging Face rate limits and task failures
      
      ---------
      Co-authored-by: CT-6282 <ricardo.godric@hotmail.com>
  2. 23 Jul, 2025 1 commit
  3. 16 Apr, 2025 1 commit
  4. 09 Aug, 2024 1 commit
    • keep new line for task description (#2116) · 8ad598df
      Jungwhan Kim authored
      
      
      * add keep trailing newline
      
      * apply ruff-format
      
      * add prompt unit test
      
      * increment the version of tasks that have description with whitespace
      
      * remove white spaces of leaderboard bbh
      
      * update MMLU expected versions in output
      
      * CI run does display the expected version=1 for mmlu subtasks, fix expected test output again
      
      ---------
      Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
  5. 26 Jun, 2024 1 commit
  6. 21 Dec, 2023 1 commit
  7. 28 Nov, 2023 2 commits
  8. 10 Nov, 2023 1 commit
  9. 01 Nov, 2023 1 commit
  10. 16 Oct, 2023 2 commits
  11. 06 Oct, 2023 2 commits
  12. 26 Sep, 2023 1 commit
  13. 21 Sep, 2023 1 commit
  14. 04 Sep, 2023 4 commits
  15. 03 Sep, 2023 1 commit