Add AIME to task description (#3296)

* register aime

* lint

---------

Co-authored-by: Baber <baber@hey.com>

Commit 6b8ec144, authored Sep 20, 2025 by Janna; committed by GitHub Sep 21, 2025

Showing 1 changed file with 3 additions and 2 deletions

lm_eval/tasks/README.md +3 -2

--- a/lm_eval/tasks/README.md
+++ b/lm_eval/tasks/README.md
@@ -12,6 +12,7 @@ provided to the individual README.md files for each subfolder.
 | [acp_bench_hard](acpbench/README.md)                                     | Tasks evaluating the reasoning ability about Action, Change, and Planning                                                                                                                                                                                                                                                              | English                                                                                                                       |
 | [aexams](aexams/README.md)                                               | Tasks in Arabic related to various academic exams covering a range of subjects.                                                                                                                                                                                                                                                        | Arabic                                                                                                                        |
 | [agieval](agieval/README.md)                                             | Tasks involving historical data or questions related to history and historical texts.                                                                                                                                                                                                                                                  | English, Chinese                                                                                                              |
+| [aime](aime/README.md)                                                   | High school math competition questions                                                                                                                                                                                                                                                                                                 | English                                                                                                                       |
 | [anli](anli/README.md)                                                   | Adversarial natural language inference tasks designed to test model robustness.                                                                                                                                                                                                                                                        | English                                                                                                                       |
 | [arabic_leaderboard_complete](arabic_leaderboard_complete/README.md)     | A full version of the tasks in the Open Arabic LLM Leaderboard, focusing on the evaluation of models that reflect the characteristics of Arabic language understanding and comprehension, culture, and heritage. Note that some of these tasks are machine-translated.                                                                 | Arabic (Some MT)                                                                                                              |
 | [arabic_leaderboard_light](arabic_leaderboard_light/README.md)           | A light version of the tasks in the Open Arabic LLM Leaderboard (i.e., 10% samples of the test set in the original benchmarks), focusing on the evaluation of models that reflect the characteristics of Arabic language understanding and comprehension, culture, and heritage. Note that some of these tasks are machine-translated. | Arabic (Some MT)                                                                                                              |