Support for AIME dataset (#3248)
* add AIME tasks
* standardize the repeats
* fix task naming
* aime25 only has test set
* edit readme
* add utils
* standardize
* fix case sensitivity
* repeat once
* lint
* more linting
* lint huggingface.py
Files added:
* lm_eval/tasks/aime/README.md (new file, mode 100644)
* lm_eval/tasks/aime/aime.yaml (new file, mode 100644)
* lm_eval/tasks/aime/utils.py (new file, mode 100644)