  1. 26 Aug, 2025 1 commit
    • Support for AIME dataset (#3248) · 5ac7cdf8
      Janna authored
      * add AIME tasks
      
      * standardize the repeats
      
      * fix task naming
      
      * aime25 only has test set
      
      * edit readme
      
      * add utils
      
      * standardize
      
      * fix case sensitivity
      
      * repeat once
      
      * lint
      
      * more linting
      
      * lint huggingface.py
  2. 07 May, 2024 1 commit
  3. 13 Mar, 2024 1 commit
  4. 11 Mar, 2024 1 commit
    • AGIEval (#1359) · a3e56afe
      Hailey Schoelkopf authored
      * add agieval
      
      * fix typo
      
      * add cloze / math exactmatch agieval tasks, rename
      
      * update exact-match agieval tasks, allow for multiple-correct answers
      
      * add more detail to readme
      
      * don't parse_math_answer twice
      
      ---------
      Co-authored-by: Alex Bäuerle <alex@a13x.io>