gaoqiong / lm-evaluation-harness
Commit 0b45cc71 (Unverified)
Authored Aug 26, 2025 by Weihao XUAN
Committed by GitHub Aug 25, 2025
Update MMLU-ProX task (#3174)
* update MMLU_ProX
* update MMLU_ProX
* cleanup code by pre-commit
Parent: 05b37f20
Changes: 741
Showing 20 changed files with 203 additions and 0 deletions
lm_eval/tasks/mmlu_prox/ja/_mmlu_prox_lite_ja.yaml  +23 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_biology.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_business.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_chemistry.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_computer_science.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_economics.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_engineering.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_health.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_history.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_law.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_math.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_other.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_philosophy.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_physics.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_psychology.yaml  +7 -0
lm_eval/tasks/mmlu_prox/ko/_ko_lite_template_yaml  +35 -0
lm_eval/tasks/mmlu_prox/ko/_mmlu_prox_lite_ko.yaml  +23 -0
lm_eval/tasks/mmlu_prox/ko/mmlu_prox_lite_ko_biology.yaml  +8 -0
lm_eval/tasks/mmlu_prox/ko/mmlu_prox_lite_ko_business.yaml  +8 -0
lm_eval/tasks/mmlu_prox/ko/mmlu_prox_lite_ko_chemistry.yaml  +8 -0
lm_eval/tasks/mmlu_prox/ja/_mmlu_prox_lite_ja.yaml
0 → 100644
group: mmlu_prox_lite_ja
task:
  - mmlu_prox_lite_ja_biology
  - mmlu_prox_lite_ja_business
  - mmlu_prox_lite_ja_chemistry
  - mmlu_prox_lite_ja_computer_science
  - mmlu_prox_lite_ja_economics
  - mmlu_prox_lite_ja_engineering
  - mmlu_prox_lite_ja_health
  - mmlu_prox_lite_ja_history
  - mmlu_prox_lite_ja_law
  - mmlu_prox_lite_ja_math
  - mmlu_prox_lite_ja_other
  - mmlu_prox_lite_ja_philosophy
  - mmlu_prox_lite_ja_physics
  - mmlu_prox_lite_ja_psychology
aggregate_metric_list:
  - aggregation: mean
    metric: exact_match
    weight_by_size: true
    filter_list: custom-extract
metadata:
  version: 0.0
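The group file asks for a `mean` of `exact_match` with `weight_by_size: true`. As a rough illustration of what a size-weighted mean does, here is a minimal sketch; the function name, the scores, and the document counts are hypothetical and are not taken from the harness.

```python
def size_weighted_mean(scores, sizes):
    """Pool per-subtask scores, weighting each by its number of documents."""
    if len(scores) != len(sizes):
        raise ValueError("scores and sizes must have the same length")
    total = sum(sizes)
    if total == 0:
        raise ValueError("at least one subtask must contain documents")
    return sum(score * size for score, size in zip(scores, sizes)) / total


# Hypothetical per-subject exact_match scores and document counts.
scores = [0.5, 1.0]
sizes = [30, 10]
pooled = size_weighted_mean(scores, sizes)  # (0.5 * 30 + 1.0 * 10) / 40
```

With weighting by size, large subjects influence the group score more than small ones; an unweighted mean of the same two scores would be 0.75 instead.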
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_biology.yaml
0 → 100644
description: '以下は生物学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_biology
task_alias: biology
process_docs: !function utils.process_biology
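Each subject file points `process_docs` at a per-subject helper such as `utils.process_biology`. The `utils` module is not part of this page, so the following is only a guess at the general idea: keep the documents that belong to one subject. The `category` field name and the use of plain lists of dicts are assumptions made for illustration.

```python
from functools import partial


def process_docs(docs, subject):
    """Keep only the documents whose (assumed) category field matches the subject."""
    return [doc for doc in docs if doc["category"] == subject]


# One partial per subject, mirroring the naming used in the YAML files.
process_biology = partial(process_docs, subject="biology")

# Hypothetical documents; real ones would come from the dataset.
docs = [
    {"question": "q1", "category": "biology"},
    {"question": "q2", "category": "law"},
    {"question": "q3", "category": "biology"},
]
biology_docs = process_biology(docs)
```

Binding the subject with `functools.partial` keeps one shared filter function while still exposing a distinct callable per subject file.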
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_business.yaml
0 → 100644
description: '以下はビジネスに関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_business
task_alias: business
process_docs: !function utils.process_business
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_chemistry.yaml
0 → 100644
description: '以下は化学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_chemistry
task_alias: chemistry
process_docs: !function utils.process_chemistry
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_computer_science.yaml
0 → 100644
description: '以下はコンピュータサイエンスに関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_computer_science
task_alias: computer_science
process_docs: !function utils.process_computer_science
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_economics.yaml
0 → 100644
description: '以下は経済学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_economics
task_alias: economics
process_docs: !function utils.process_economics
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_engineering.yaml
0 → 100644
description: '以下は工学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_engineering
task_alias: engineering
process_docs: !function utils.process_engineering
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_health.yaml
0 → 100644
description: '以下は健康科学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_health
task_alias: health
process_docs: !function utils.process_health
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_history.yaml
0 → 100644
description: '以下は歴史に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_history
task_alias: history
process_docs: !function utils.process_history
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_law.yaml
0 → 100644
description: '以下は法律に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_law
task_alias: law
process_docs: !function utils.process_law
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_math.yaml
0 → 100644
description: '以下は数学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_math
task_alias: math
process_docs: !function utils.process_math
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_other.yaml
0 → 100644
description: '以下はその他に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_other
task_alias: other
process_docs: !function utils.process_other
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_philosophy.yaml
0 → 100644
description: '以下は哲学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_philosophy
task_alias: philosophy
process_docs: !function utils.process_philosophy
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_physics.yaml
0 → 100644
description: '以下は物理学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_physics
task_alias: physics
process_docs: !function utils.process_physics
lm_eval/tasks/mmlu_prox/ja/mmlu_prox_lite_ja_psychology.yaml
0 → 100644
description: '以下は心理学に関する選択問題(解答付き)です。段階的に考え、最後に「答えは (X) です」と回答を締めくくってください。Xは正解の選択肢を示す文字です。

  '
include: _ja_lite_template_yaml
task: mmlu_prox_lite_ja_psychology
task_alias: psychology
process_docs: !function utils.process_psychology
lm_eval/tasks/mmlu_prox/ko/_ko_lite_template_yaml
0 → 100644
dataset_path: li-lab/MMLU-ProX-Lite
dataset_name: ko
test_split: test
fewshot_split: validation
fewshot_config:
  sampler: first_n
  doc_to_text: !function utils.fewshot_to_text
  doc_to_target: ""
output_type: generate_until
doc_to_text: !function utils.doc_to_text
doc_to_target: answer
filter_list:
  - name: "custom-extract"
    filter:
      - function: "regex"
        regex_pattern: '답은 \(?([ABCDEFGHIJ])\)?입니다'
      - function: "take_first"
generation_kwargs:
  until:
    - "</s>"
    - "Q:"
    - "질문:"
    - "<|im_end|>"
  do_sample: false
  temperature: 0.0
  max_gen_toks: 2048
num_fewshot: 5
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
    ignore_case: true
    ignore_punctuation: true
metadata:
  version: 0.0
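The `custom-extract` filter relies on the pattern `답은 \(?([ABCDEFGHIJ])\)?입니다` to pull the answer letter out of a generated response. The sketch below applies that same pattern to a single response with Python's standard `re` module. It models only the regex step; the helper name and the `"[invalid]"` fallback value are assumptions for illustration, not a statement of how the harness filter behaves.

```python
import re

# Pattern copied from the template: "답은 (X)입니다" with optional parentheses.
ANSWER_PATTERN = re.compile(r"답은 \(?([ABCDEFGHIJ])\)?입니다")


def extract_answer(response, fallback="[invalid]"):
    """Return the first answer letter found in the response, or the fallback."""
    matches = ANSWER_PATTERN.findall(response)
    return matches[0] if matches else fallback


with_parens = extract_answer("단계적으로 생각해 보면 ... 답은 (C)입니다.")
without_parens = extract_answer("답은 B입니다")
no_match = extract_answer("정답을 모르겠습니다.")
```

Note that the pattern accepts only the letters A through J and expects exactly one space after `답은`, so responses that phrase the conclusion differently would fall through to the fallback.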
lm_eval/tasks/mmlu_prox/ko/_mmlu_prox_lite_ko.yaml
0 → 100644
group: mmlu_prox_lite_ko
task:
  - mmlu_prox_lite_ko_biology
  - mmlu_prox_lite_ko_business
  - mmlu_prox_lite_ko_chemistry
  - mmlu_prox_lite_ko_computer_science
  - mmlu_prox_lite_ko_economics
  - mmlu_prox_lite_ko_engineering
  - mmlu_prox_lite_ko_health
  - mmlu_prox_lite_ko_history
  - mmlu_prox_lite_ko_law
  - mmlu_prox_lite_ko_math
  - mmlu_prox_lite_ko_other
  - mmlu_prox_lite_ko_philosophy
  - mmlu_prox_lite_ko_physics
  - mmlu_prox_lite_ko_psychology
aggregate_metric_list:
  - aggregation: mean
    metric: exact_match
    weight_by_size: true
    filter_list: custom-extract
metadata:
  version: 0.0
lm_eval/tasks/mmlu_prox/ko/mmlu_prox_lite_ko_biology.yaml
0 → 100644
description: '다음은 생물학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_lite_template_yaml
task: mmlu_prox_lite_ko_biology
task_alias: biology
process_docs: !function utils.process_biology
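Each subject file names a shared template through `include` and then adds its own keys. One way to picture the result is a dictionary merge in which the subject file's keys are layered on top of the template's. This is a simplified sketch of that idea under the assumption that the including file takes precedence; `resolve_include` is a hypothetical helper, not a harness function.

```python
def resolve_include(template, task_config):
    """Layer a task config over its template; the task's own keys take precedence."""
    merged = dict(template)
    merged.update({k: v for k, v in task_config.items() if k != "include"})
    return merged


# A few keys from the Korean template and the biology task, as plain dicts.
template = {
    "dataset_path": "li-lab/MMLU-ProX-Lite",
    "dataset_name": "ko",
    "num_fewshot": 5,
}
task_config = {
    "include": "_ko_lite_template_yaml",
    "task": "mmlu_prox_lite_ko_biology",
    "task_alias": "biology",
}
resolved = resolve_include(template, task_config)
```

This layout is what lets the subject files stay short: only the description, task name, alias, and `process_docs` hook differ between subjects.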
lm_eval/tasks/mmlu_prox/ko/mmlu_prox_lite_ko_business.yaml
0 → 100644
description: '다음은 경영학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_lite_template_yaml
task: mmlu_prox_lite_ko_business
task_alias: business
process_docs: !function utils.process_business
lm_eval/tasks/mmlu_prox/ko/mmlu_prox_lite_ko_chemistry.yaml
0 → 100644
description: '다음은 화학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_lite_template_yaml
task: mmlu_prox_lite_ko_chemistry
task_alias: chemistry
process_docs: !function utils.process_chemistry