gaoqiong / lm-evaluation-harness
Commit 5c006ed4
authored Jan 25, 2025 by Minho Ryu
Committed by GitHub, Jan 24, 2025
separate category for `global_mmlu` (#2652)
* separate category
* set version 0.0
* apply precommit
parent 370e2f9e
Changes 193
Showing 20 changed files with 143 additions and 44 deletions
lm_eval/tasks/global_mmlu/default/_generate_configs.py  +0 -42
lm_eval/tasks/global_mmlu/default/ar/_ar_template_yaml  +1 -2
lm_eval/tasks/global_mmlu/default/ar/_global_mmlu_ar.yaml  +13 -0
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_business.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_humanities.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_medical.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_other.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_social_sciences.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_stem.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/ar/utils.py  +18 -0
lm_eval/tasks/global_mmlu/default/bn/_bn_template_yaml  +16 -0
lm_eval/tasks/global_mmlu/default/bn/_global_mmlu_bn.yaml  +13 -0
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_business.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_humanities.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_medical.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_other.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_social_sciences.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_stem.yaml  +4 -0
lm_eval/tasks/global_mmlu/default/bn/utils.py  +18 -0
lm_eval/tasks/global_mmlu/default/de/_de_template_yaml  +16 -0
lm_eval/tasks/global_mmlu/default/_generate_configs.py
deleted
100644 → 0
View file @ 370e2f9e
import yaml


languages = [
    "en",
    "ar",
    "fr",
    "es",
    "hi",
    "de",
    "id",
    "it",
    "ja",
    "ko",
    "pt",
    "zh",
    "yo",
    "bn",
    "sw",
]


def main() -> None:
    for language in languages:
        file_name = f"global_mmlu_{language}.yaml"
        try:
            with open(f"{file_name}", "w") as f:
                f.write("# Generated by _generate_configs.py\n")
                yaml.dump(
                    {
                        "include": "_default_yaml",
                        "task": f"global_mmlu_{language}",
                        "dataset_name": language,
                    },
                    f,
                )
        except FileExistsError:
            pass


if __name__ == "__main__":
    main()
lm_eval/tasks/global_mmlu/default/_default_yaml → lm_eval/tasks/global_mmlu/default/ar/_ar_template_yaml
View file @ 5c006ed4
tag:
- global_mmlu
dataset_path: CohereForAI/Global-MMLU-Lite
dataset_name: ar
test_split: test
fewshot_split: dev
fewshot_config:
...
...
lm_eval/tasks/global_mmlu/default/ar/_global_mmlu_ar.yaml
0 → 100644
View file @ 5c006ed4
group: global_mmlu_ar
task:
  - global_mmlu_ar_business
  - global_mmlu_ar_humanities
  - global_mmlu_ar_medical
  - global_mmlu_ar_other
  - global_mmlu_ar_stem
  - global_mmlu_ar_social_sciences
aggregate_metric_list:
  - metric: acc
    weight_by_size: True
metadata:
  version: 0.0
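Note on `weight_by_size: True` in the group config above: the commit does not show how the harness aggregates, so the following is only a sketch of what a size-weighted mean conventionally computes (each subtask's accuracy weighted by its number of examples). The helper name `size_weighted_mean` and all accuracies and sizes are made up for illustration.

```python
def size_weighted_mean(scores, sizes):
    """Average per-subtask scores, weighting each by its example count.

    Hypothetical helper: a sketch of size-weighted aggregation,
    not the harness's own code.
    """
    total = sum(sizes)
    if total == 0:
        raise ValueError("no examples to aggregate")
    return sum(score * n for score, n in zip(scores, sizes)) / total


# Made-up accuracies and subtask sizes, for illustration only.
subtask_acc = {"global_mmlu_ar_business": 0.5, "global_mmlu_ar_stem": 0.75}
subtask_size = {"global_mmlu_ar_business": 20, "global_mmlu_ar_stem": 60}

names = list(subtask_acc)
weighted = size_weighted_mean(
    [subtask_acc[n] for n in names],
    [subtask_size[n] for n in names],
)
unweighted = sum(subtask_acc.values()) / len(subtask_acc)

print(f"size-weighted: {weighted:.4f}, unweighted: {unweighted:.4f}")
```

With unequal subtask sizes the two averages differ; the larger subtask pulls the weighted figure toward its own accuracy.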
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_business.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _ar_template_yaml
process_docs: !function utils.process_business
task: global_mmlu_ar_business
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_humanities.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _ar_template_yaml
process_docs: !function utils.process_humanities
task: global_mmlu_ar_humanities
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_medical.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _ar_template_yaml
process_docs: !function utils.process_medical
task: global_mmlu_ar_medical
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_other.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _ar_template_yaml
process_docs: !function utils.process_other
task: global_mmlu_ar_other
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_social_sciences.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _ar_template_yaml
process_docs: !function utils.process_social_sciences
task: global_mmlu_ar_social_sciences
lm_eval/tasks/global_mmlu/default/ar/global_mmlu_ar_stem.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _ar_template_yaml
process_docs: !function utils.process_stem
task: global_mmlu_ar_stem
lm_eval/tasks/global_mmlu/default/ar/utils.py
0 → 100644
View file @ 5c006ed4
from functools import partial


CATEGORIES = ["Business", "Humanities", "Medical", "Other", "STEM", "Social Sciences"]


def process_docs(dataset, category):
    return dataset.filter(lambda x: x["subject_category"] == category)


process_functions = {
    f"process_{category.lower().replace(' ', '_')}": partial(
        process_docs, category=category
    )
    for category in CATEGORIES
}

globals().update(process_functions)
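To see how `utils.py` above ends up exposing names such as `process_stem` (which the per-category YAML files reference via `!function`), here is a small self-contained sketch. It repeats the same `partial` and `globals().update` pattern, but runs it against `ToyDataset`, a hypothetical stand-in that only implements `.filter(fn)`; the rows are made up, and the real dataset object used by the harness is assumed to behave similarly.

```python
from functools import partial

CATEGORIES = ["Business", "Humanities", "Medical", "Other", "STEM", "Social Sciences"]


class ToyDataset:
    """Hypothetical stand-in for a dataset object that exposes .filter(fn)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, fn):
        # Keep only the rows for which fn(row) is truthy.
        return ToyDataset(row for row in self.rows if fn(row))


def process_docs(dataset, category):
    return dataset.filter(lambda x: x["subject_category"] == category)


# Same naming rule as utils.py: "Social Sciences" -> "process_social_sciences".
process_functions = {
    f"process_{category.lower().replace(' ', '_')}": partial(
        process_docs, category=category
    )
    for category in CATEGORIES
}

# Publish each partial as a module-level name, e.g. process_stem.
globals().update(process_functions)

# Made-up rows, for illustration only.
toy = ToyDataset(
    [
        {"question": "q1", "subject_category": "STEM"},
        {"question": "q2", "subject_category": "Social Sciences"},
        {"question": "q3", "subject_category": "STEM"},
    ]
)

stem_rows = process_stem(toy).rows
social_rows = process_social_sciences(toy).rows

print(sorted(process_functions))
print([row["question"] for row in stem_rows])
```

Binding `category` through `partial` (rather than a lambda closing over the loop variable) gives each generated function its own fixed category.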
lm_eval/tasks/global_mmlu/default/bn/_bn_template_yaml
0 → 100644
View file @ 5c006ed4
dataset_path: CohereForAI/Global-MMLU-Lite
dataset_name: bn
test_split: test
fewshot_split: dev
fewshot_config:
  sampler: default
output_type: multiple_choice
doc_to_text: "{{question.strip()}}\nA. {{option_a}}\nB. {{option_b}}\nC. {{option_c}}\nD. {{option_d}}\nAnswer:"
doc_to_choice: ["A", "B", "C", "D"]
doc_to_target: answer
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
metadata:
  version: 0.0
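To make the `doc_to_text` template above concrete, the sketch below imitates it in plain Python rather than Jinja. The helper `render_prompt` and the example document are hypothetical; the field names (`question`, `option_a` to `option_d`, `answer`) are taken from the template, and it is an assumption here that `answer` holds one of the letters listed in `doc_to_choice`.

```python
CHOICES = ["A", "B", "C", "D"]  # mirrors doc_to_choice in the template


def render_prompt(doc):
    """Plain-Python imitation of the doc_to_text template (hypothetical helper)."""
    return (
        f"{doc['question'].strip()}\n"
        f"A. {doc['option_a']}\n"
        f"B. {doc['option_b']}\n"
        f"C. {doc['option_c']}\n"
        f"D. {doc['option_d']}\n"
        "Answer:"
    )


# Made-up document, for illustration only.
example_doc = {
    "question": "  What is 2 + 2?  ",
    "option_a": "3",
    "option_b": "4",
    "option_c": "5",
    "option_d": "6",
    "answer": "B",  # assumption: the answer column stores a choice letter
}

prompt = render_prompt(example_doc)
# Under the assumption above, the target is the position of the letter in CHOICES.
target_index = CHOICES.index(example_doc["answer"])

print(prompt)
print(target_index)
```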
lm_eval/tasks/global_mmlu/default/bn/_global_mmlu_bn.yaml
0 → 100644
View file @ 5c006ed4
group: global_mmlu_bn
task:
  - global_mmlu_bn_business
  - global_mmlu_bn_humanities
  - global_mmlu_bn_medical
  - global_mmlu_bn_other
  - global_mmlu_bn_stem
  - global_mmlu_bn_social_sciences
aggregate_metric_list:
  - metric: acc
    weight_by_size: True
metadata:
  version: 0.0
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_business.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _bn_template_yaml
process_docs: !function utils.process_business
task: global_mmlu_bn_business
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_humanities.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _bn_template_yaml
process_docs: !function utils.process_humanities
task: global_mmlu_bn_humanities
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_medical.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _bn_template_yaml
process_docs: !function utils.process_medical
task: global_mmlu_bn_medical
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_other.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _bn_template_yaml
process_docs: !function utils.process_other
task: global_mmlu_bn_other
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_social_sciences.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _bn_template_yaml
process_docs: !function utils.process_social_sciences
task: global_mmlu_bn_social_sciences
lm_eval/tasks/global_mmlu/default/bn/global_mmlu_bn_stem.yaml
0 → 100644
View file @ 5c006ed4
# Generated by _generate_configs.py
include: _bn_template_yaml
process_docs: !function utils.process_stem
task: global_mmlu_bn_stem
lm_eval/tasks/global_mmlu/default/bn/utils.py
0 → 100644
View file @ 5c006ed4
from functools import partial


CATEGORIES = ["Business", "Humanities", "Medical", "Other", "STEM", "Social Sciences"]


def process_docs(dataset, category):
    return dataset.filter(lambda x: x["subject_category"] == category)


process_functions = {
    f"process_{category.lower().replace(' ', '_')}": partial(
        process_docs, category=category
    )
    for category in CATEGORIES
}

globals().update(process_functions)
lm_eval/tasks/global_mmlu/default/de/_de_template_yaml
0 → 100644
View file @ 5c006ed4
dataset_path: CohereForAI/Global-MMLU-Lite
dataset_name: de
test_split: test
fewshot_split: dev
fewshot_config:
  sampler: default
output_type: multiple_choice
doc_to_text: "{{question.strip()}}\nA. {{option_a}}\nB. {{option_b}}\nC. {{option_c}}\nD. {{option_d}}\nAnswer:"
doc_to_choice: ["A", "B", "C", "D"]
doc_to_target: answer
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
metadata:
  version: 0.0