gaoqiong / lm-evaluation-harness
Commit 2b56339e, authored Jan 17, 2025 by Baber
Merge branch 'main' into longcxt
Parents: 0b533339, 703fbffd
Changes: 316
Showing 20 changed files with 200 additions and 0 deletions
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_social-science_civics.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_social-science_economics.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_social-science_geography.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_social-science_social-science.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_stem_computer-science.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_stem_natural-science.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_humanities_islamic-studies.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_language_arabic-language-general.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_language_arabic-language-grammar.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_other_driving-test.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_other_general-knowledge.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_humanities_history.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_humanities_islamic-studies.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_language_arabic-language.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_other_general-knowledge.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_social-science_geography.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_social-science_social-science.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_stem_computer-science.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_stem_math.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_stem_natural-science.yaml (+10 -0)
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_social-science_civics.yaml
0 → 100644
"dataset_name": "middle_social-science_civics"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_social-science_lev"
"task": "AraDiCE_ArabicMMLU_middle_social-science_civics_lev"
"task_alias": "middle social-science civics"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_social-science_economics.yaml
0 → 100644
"dataset_name": "middle_social-science_economics"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_social-science_lev"
"task": "AraDiCE_ArabicMMLU_middle_social-science_economics_lev"
"task_alias": "middle social-science economics"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_social-science_geography.yaml
0 → 100644
"dataset_name": "middle_social-science_geography"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_social-science_lev"
"task": "AraDiCE_ArabicMMLU_middle_social-science_geography_lev"
"task_alias": "middle social-science geography"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_social-science_social-science.yaml
0 → 100644
"dataset_name": "middle_social-science_social-science"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_social-science_lev"
"task": "AraDiCE_ArabicMMLU_middle_social-science_social-science_lev"
"task_alias": "middle social-science social-science"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_stem_computer-science.yaml
0 → 100644
"dataset_name": "middle_stem_computer-science"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_stem_lev"
"task": "AraDiCE_ArabicMMLU_middle_stem_computer-science_lev"
"task_alias": "middle stem computer-science"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_middle_stem_natural-science.yaml
0 → 100644
"dataset_name": "middle_stem_natural-science"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_stem_lev"
"task": "AraDiCE_ArabicMMLU_middle_stem_natural-science_lev"
"task_alias": "middle stem natural-science"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_humanities_islamic-studies.yaml
0 → 100644
"dataset_name": "na_humanities_islamic-studies"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_humanities_lev"
"task": "AraDiCE_ArabicMMLU_na_humanities_islamic-studies_lev"
"task_alias": "na humanities islamic-studies"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_language_arabic-language-general.yaml
0 → 100644
"dataset_name": "na_language_arabic-language-general"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_language_lev"
"task": "AraDiCE_ArabicMMLU_na_language_arabic-language-general_lev"
"task_alias": "na language arabic-language-general"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_language_arabic-language-grammar.yaml
0 → 100644
"dataset_name": "na_language_arabic-language-grammar"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_language_lev"
"task": "AraDiCE_ArabicMMLU_na_language_arabic-language-grammar_lev"
"task_alias": "na language arabic-language-grammar"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_other_driving-test.yaml
0 → 100644
"dataset_name": "na_other_driving-test"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_other_lev"
"task": "AraDiCE_ArabicMMLU_na_other_driving-test_lev"
"task_alias": "na other driving-test"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_na_other_general-knowledge.yaml
0 → 100644
"dataset_name": "na_other_general-knowledge"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_other_lev"
"task": "AraDiCE_ArabicMMLU_na_other_general-knowledge_lev"
"task_alias": "na other general-knowledge"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_humanities_history.yaml
0 → 100644
"dataset_name": "primary_humanities_history"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_humanities_lev"
"task": "AraDiCE_ArabicMMLU_primary_humanities_history_lev"
"task_alias": "primary humanities history"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_humanities_islamic-studies.yaml
0 → 100644
"dataset_name": "primary_humanities_islamic-studies"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_humanities_lev"
"task": "AraDiCE_ArabicMMLU_primary_humanities_islamic-studies_lev"
"task_alias": "primary humanities islamic-studies"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_language_arabic-language.yaml
0 → 100644
"dataset_name": "primary_language_arabic-language"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_language_lev"
"task": "AraDiCE_ArabicMMLU_primary_language_arabic-language_lev"
"task_alias": "primary language arabic-language"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_other_general-knowledge.yaml
0 → 100644
"dataset_name": "primary_other_general-knowledge"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_other_lev"
"task": "AraDiCE_ArabicMMLU_primary_other_general-knowledge_lev"
"task_alias": "primary other general-knowledge"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_social-science_geography.yaml
0 → 100644
"dataset_name": "primary_social-science_geography"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_social-science_lev"
"task": "AraDiCE_ArabicMMLU_primary_social-science_geography_lev"
"task_alias": "primary social-science geography"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_social-science_social-science.yaml
0 → 100644
"dataset_name": "primary_social-science_social-science"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_social-science_lev"
"task": "AraDiCE_ArabicMMLU_primary_social-science_social-science_lev"
"task_alias": "primary social-science social-science"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_stem_computer-science.yaml
0 → 100644
"dataset_name": "primary_stem_computer-science"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_stem_lev"
"task": "AraDiCE_ArabicMMLU_primary_stem_computer-science_lev"
"task_alias": "primary stem computer-science"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_stem_math.yaml
0 → 100644
"dataset_name": "primary_stem_math"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_stem_lev"
"task": "AraDiCE_ArabicMMLU_primary_stem_math_lev"
"task_alias": "primary stem math"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
lm_eval/tasks/aradice/ArabicMMLU/LEV/AraDiCE_ArabicMMLU_primary_stem_natural-science.yaml
0 → 100644
"dataset_name": "primary_stem_natural-science"
"description": ""
"fewshot_split": !!null "null"
"include": "_default_template_yaml"
"tag": "AraDiCE_ArabicMMLU_stem_lev"
"task": "AraDiCE_ArabicMMLU_primary_stem_natural-science_lev"
"task_alias": "primary stem natural-science"
"test_split": "test"
"training_split": !!null "null"
"validation_split": !!null "null"
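Note: the twenty files in this commit differ only in the fields derived from `dataset_name`, which appears to follow a `level_category_subject` pattern. The sketch below reproduces that pattern in plain Python to make the naming convention explicit. It is not the script that generated these files (the commit does not show one). The function names `build_config` and `render` are hypothetical, and the `lev` suffix default is inferred from the `LEV` directory and the task names above.

```python
from typing import Dict, Optional


def build_config(dataset_name: str, dialect: str = "lev") -> Dict[str, Optional[str]]:
    """Derive the per-subject fields from a name like 'middle_social-science_civics'.

    Assumes the name has exactly three underscore-separated parts:
    level, category, subject (hyphens inside a part are kept as-is).
    """
    level, category, subject = dataset_name.split("_", 2)
    prefix = "AraDiCE_ArabicMMLU"
    return {
        "dataset_name": dataset_name,
        "description": "",
        "fewshot_split": None,
        "include": "_default_template_yaml",
        "tag": f"{prefix}_{category}_{dialect}",
        "task": f"{prefix}_{dataset_name}_{dialect}",
        "task_alias": f"{level} {category} {subject}",
        "test_split": "test",
        "training_split": None,
        "validation_split": None,
    }


def render(config: Dict[str, Optional[str]]) -> str:
    """Render the mapping in the quoted style used by the files in this commit."""
    lines = []
    for key, value in config.items():
        # None is written with an explicit YAML null tag, as in the diffs above.
        rendered = '!!null "null"' if value is None else f'"{value}"'
        lines.append(f'"{key}": {rendered}')
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(build_config("primary_stem_math")))
```

Calling `build_config` with any `dataset_name` listed in this commit should yield the same `tag`, `task`, and `task_alias` values as the corresponding file. Names that do not split into three parts are outside what this sketch handles.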