"git@developer.sourcefind.cn:sugon_wxj/megatron-lm.git" did not exist on "996aca68639bd0550c4c7d46a40ab10f4789506a"
Commit 8aeff141, authored by heli-qi, committed by GitHub

Add MMLU-ProX task (#2811)

* update mmlu_prox configs

* update tasks/README

* correct hyphen to underscore in task/README

* update pre-commit codes
parent 8028a42f
description: '다음은 생물학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요.
  여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_biology
task_alias: biology
process_docs: !function utils.process_biology
description: '다음은 경영학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요.
  여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_business
task_alias: business
process_docs: !function utils.process_business
description: '다음은 화학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서
  X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_chemistry
task_alias: chemistry
process_docs: !function utils.process_chemistry
description: '다음은 컴퓨터 과학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요.
  여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_computer_science
task_alias: computer_science
process_docs: !function utils.process_computer_science
description: '다음은 경제학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요.
  여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_economics
task_alias: economics
process_docs: !function utils.process_economics
description: '다음은 공학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서
  X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_engineering
task_alias: engineering
process_docs: !function utils.process_engineering
description: '다음은 건강에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서
  X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_health
task_alias: health
process_docs: !function utils.process_health
description: '다음은 역사에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서
  X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_history
task_alias: history
process_docs: !function utils.process_history
description: '다음은 법률에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서
  X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_law
task_alias: law
process_docs: !function utils.process_law
description: '다음은 수학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서
  X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_math
task_alias: math
process_docs: !function utils.process_math
description: '다음은 기타에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서
  X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_other
task_alias: other
process_docs: !function utils.process_other
description: '다음은 철학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요. 여기서
  X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_philosophy
task_alias: philosophy
process_docs: !function utils.process_philosophy
description: '다음은 물리학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요.
  여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_physics
task_alias: physics
process_docs: !function utils.process_physics
description: '다음은 심리학에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "답은 (X)입니다"로 답변을 마무리하세요.
  여기서 X는 올바른 선택지 문자입니다.

  '
include: _ko_template_yaml
task: mmlu_prox_ko_psychology
task_alias: psychology
process_docs: !function utils.process_psychology
from functools import partial
from os.path import basename, dirname
from lm_eval.tasks.mmlu_prox.lang_libs import LANG_LIBS
lang_abbr = basename(dirname(__file__))
lang_dict = LANG_LIBS[lang_abbr]
choices = [
"A",
"B",
"C",
"D",
"E",
"F",
"G",
"H",
"I",
"J",
"K",
"L",
"M",
"N",
"O",
"P",
]
max_opt_num = 10
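

def format_cot_example(example, including_answer=True):
    """Build one prompt block: question, lettered options, then the answer cue.

    With including_answer=True (few-shot examples) the chain-of-thought text is
    appended; otherwise the prompt ends with the bare answer cue.
    """
    prompt = f"{lang_dict[0]}\n"
    question = example["question"]
    prompt += question + "\n"
    prompt += f"{lang_dict[1]}\n"
    for i in range(max_opt_num):
        opt = example[f"option_{i}"]
        # Unused option slots are stored as None and skipped.
        if opt is not None:
            prompt += "{}. {}\n".format(choices[i], opt)
    if including_answer:
        # Swap the dataset's "A: ..." cue for the localized "Answer: ..." cue.
        cot_content = example["cot_content"].replace(lang_dict[4], lang_dict[2])
        prompt += cot_content + "\n\n"
    else:
        prompt += lang_dict[2]
    return prompt

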
doc_to_text = partial(format_cot_example, including_answer=False)
fewshot_to_text = partial(format_cot_example, including_answer=True)


def process_docs(dataset, subject):
    return dataset.filter(lambda x: x["category"] == subject)


process_biology = partial(process_docs, subject="biology")
process_business = partial(process_docs, subject="business")
process_chemistry = partial(process_docs, subject="chemistry")
process_computer_science = partial(process_docs, subject="computer science")
process_economics = partial(process_docs, subject="economics")
process_engineering = partial(process_docs, subject="engineering")
process_health = partial(process_docs, subject="health")
process_history = partial(process_docs, subject="history")
process_law = partial(process_docs, subject="law")
process_math = partial(process_docs, subject="math")
process_other = partial(process_docs, subject="other")
process_philosophy = partial(process_docs, subject="philosophy")
process_physics = partial(process_docs, subject="physics")
process_psychology = partial(process_docs, subject="psychology")
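
To make the prompt layout produced by `format_cot_example` concrete, here is a minimal standalone sketch. It hard-codes the English `LANG_LIBS` entry instead of resolving `lang_dict` from the directory name, and the sample document is made up for illustration; it is not taken from the dataset.

```python
# Stand-in for LANG_LIBS["en"]; the real utils.py selects this list by the
# name of the directory that contains the file.
lang_dict = [
    "Question:",
    "Options:",
    "Answer: Let's think step by step.",
    "",  # description template, not used by the formatter
    "A: Let's think step by step.",
    "the answer is ({})",
]
choices = "ABCDEFGHIJ"
max_opt_num = 10


def format_cot_example(example, including_answer=True):
    prompt = f"{lang_dict[0]}\n{example['question']}\n{lang_dict[1]}\n"
    for i in range(max_opt_num):
        opt = example[f"option_{i}"]
        if opt is not None:  # empty option slots are None
            prompt += f"{choices[i]}. {opt}\n"
    if including_answer:
        prompt += example["cot_content"].replace(lang_dict[4], lang_dict[2]) + "\n\n"
    else:
        prompt += lang_dict[2]
    return prompt


# Hypothetical document: two real options, eight empty slots.
doc = {f"option_{i}": None for i in range(max_opt_num)}
doc.update(
    {
        "question": "What is 2 + 2?",
        "option_0": "3",
        "option_1": "4",
        "cot_content": "A: Let's think step by step. 2 + 2 = 4. The answer is (B).",
    }
)

test_prompt = format_cot_example(doc, including_answer=False)  # like doc_to_text
shot_prompt = format_cot_example(doc, including_answer=True)  # like fewshot_to_text
print(test_prompt)
```

The test prompt stops at the answer cue so the model continues from there, while the few-shot variant carries the full reasoning and ends with a blank line separating it from the next example.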
LANG_LIBS = {
"en": [
"Question:",
"Options:",
"Answer: Let's think step by step.",
'The following are multiple choice questions (with answers) about {subject}. Think step by step and then finish your answer with "{ans_suffix}" where X is the correct letter choice.',
"A: Let's think step by step.",
"the answer is ({})",
],
"ja": [
"質問:",
"選択肢:",
"回答:一歩一歩考えていきましょう。",
"以下は{subject}に関する選択問題(解答付き)です。段階的に考え、最後に「{ans_suffix}」と回答を締めくくってください。Xは正解の選択肢を示す文字です。",
"A: 一歩一歩考えていきましょう。",
"答えは ({}) です",
],
"zh": [
"问题:",
"选项:",
"答案:让我们一步一步地思考。",
'以下是关于{subject}的选择题(带有答案)。请逐步思考,然后以"{ans_suffix}"结束您的回答,其中X是正确的选项字母。',
"A: 让我们一步一步地思考。",
"答案是 ({})",
],
"ko": [
"질문:",
"선택 사항:",
"답변: 한 단계씩 생각해 봅시다.",
'다음은 {subject}에 관한 객관식 문제(정답 포함)입니다. 단계적으로 생각한 다음 "{ans_suffix}"로 답변을 마무리하세요. 여기서 X는 올바른 선택지 문자입니다.',
"A: 한 단계씩 생각해 봅시다.",
"답은 ({})입니다",
],
"fr": [
"Question :",
"Options :",
"Réponse : Réfléchissons étape par étape.",
'Voici des questions à choix multiples (avec réponses) sur {subject}. Réfléchissez étape par étape, puis terminez votre réponse par "{ans_suffix}" où X est la lettre correspondant au bon choix.',
"A: Réfléchissons étape par étape.",
"La réponse est ({})",
],
"de": [
"Frage:",
"Optionen:",
"Antwort: Denken wir Schritt für Schritt nach.",
'Im Folgenden sind Multiple-Choice-Fragen (mit Antworten) zu {subject}. Denken Sie Schritt für Schritt nach und beenden Sie Ihre Antwort mit "{ans_suffix}", wobei X der richtige Buchstabe ist.',
"A: Denken wir Schritt für Schritt nach.",
"Die Antwort ist ({})",
],
"es": [
"Pregunta:",
"Opciones:",
"Respuesta: Pensemos paso a paso.",
'Las siguientes son preguntas de opción múltiple (con respuestas) sobre {subject}. Piense paso a paso y luego termine su respuesta con "{ans_suffix}" donde X es la letra de la opción correcta.',
"A: Pensemos paso a paso.",
"La respuesta es ({})",
],
"pt": [
"Pergunta:",
"Opções:",
"Resposta: Vamos pensar passo a passo.",
'A seguir estão perguntas de múltipla escolha (com respostas) sobre {subject}. Pense passo a passo e termine sua resposta com "{ans_suffix}" onde X é a letra da opção correta.',
"A: Vamos pensar passo a passo.",
"A resposta é ({})",
],
"sw": [
"Swali:",
"Chaguo:",
"Jibu: Hebu tufikiria hatua kwa hatua.",
'Yafuatayo ni maswali ya chaguo-nyingi (yenye majibu) kuhusu {subject}. Fikiria hatua kwa hatua kisha malizia jibu lako kwa "{ans_suffix}" ambapo X ni herufi ya chaguo sahihi.',
"A: Hebu tufikiria hatua kwa hatua.",
"Jibu ni ({})",
],
"th": [
"คำถาม:",
"ตัวเลือก:",
"คำตอบ: มาคิดทีละขั้นตอนกัน",
'ต่อไปนี้เป็นคำถามปรนัย (พร้อมคำตอบ) เกี่ยวกับ {subject} คิดทีละขั้นตอนแล้วสรุปคำตอบด้วย "{ans_suffix}" โดยที่ X คือตัวอักษรที่เป็นตัวเลือกที่ถูกต้อง',
"A: มาคิดทีละขั้นตอนกัน",
"คำตอบคือ ({})",
],
"ar": [
"سؤال:",
"الخيارات:",
"الإجابة: دعنا نفكر خطوة بخطوة.",
"فيما يلي أسئلة اختيار من متعدد (مع إجابات) حول {subject}. فكر خطوة بخطوة ثم أنهِ إجابتك بـ '{ans_suffix}' حيث X هو حرف الخيار الصحيح.",
"أ: دعنا نفكر خطوة بخطوة.",
"الإجابة هي ({})",
],
"hi": [
"प्रश्न:",
"विकल्प:",
"उत्तर: चलिए चरण-दर-चरण सोचते हैं।",
'निम्नलिखित {subject} के बारे में बहुविकल्पीय प्रश्न (उत्तरों के साथ) हैं। चरण-दर-चरण सोचें और फिर अपने उत्तर को "{ans_suffix}" के साथ समाप्त करें जहां X सही विकल्प का अक्षर है।',
"A: चलिए चरण-दर-चरण सोचते हैं।",
"उत्तर है ({})",
],
"bn": [
"প্রশ্ন:",
"বিকল্পগুলি:",
"উত্তর: আসুন ধাপে ধাপে চিন্তা করি।",
'নিম্নলিখিত {subject} সম্পর্কে বহুনির্বাচনী প্রশ্ন (উত্তরসহ)। ধাপে ধাপে চিন্তা করুন এবং তারপর আপনার উত্তর "{ans_suffix}" দিয়ে শেষ করুন যেখানে X হল সঠিক বিকল্পের অক্ষর।',
"A: আসুন ধাপে ধাপে চিন্তা করি।",
"উত্তর হল ({})",
],
}
LANG_SUBJECTS = {
"en": {
"biology": "biology",
"business": "business",
"chemistry": "chemistry",
"computer_science": "computer_science",
"economics": "economics",
"engineering": "engineering",
"health": "health",
"history": "history",
"law": "law",
"math": "math",
"other": "other",
"philosophy": "philosophy",
"physics": "physics",
"psychology": "psychology",
},
"ja": {
"biology": "生物学",
"business": "ビジネス",
"chemistry": "化学",
"computer_science": "コンピュータサイエンス",
"economics": "経済学",
"engineering": "工学",
"health": "健康科学",
"history": "歴史",
"law": "法律",
"math": "数学",
"other": "その他",
"philosophy": "哲学",
"physics": "物理学",
"psychology": "心理学",
},
"zh": {
"biology": "生物学",
"business": "商业",
"chemistry": "化学",
"computer_science": "计算机科学",
"economics": "经济学",
"engineering": "工程学",
"health": "健康",
"history": "历史",
"law": "法律",
"math": "数学",
"other": "其他",
"philosophy": "哲学",
"physics": "物理学",
"psychology": "心理学",
},
"ko": {
"biology": "생물학",
"business": "경영학",
"chemistry": "화학",
"computer_science": "컴퓨터 과학",
"economics": "경제학",
"engineering": "공학",
"health": "건강",
"history": "역사",
"law": "법률",
"math": "수학",
"other": "기타",
"philosophy": "철학",
"physics": "물리학",
"psychology": "심리학",
},
"fr": {
"biology": "biologie",
"business": "commerce",
"chemistry": "chimie",
"computer_science": "informatique",
"economics": "économie",
"engineering": "ingénierie",
"health": "santé",
"history": "histoire",
"law": "droit",
"math": "mathématiques",
"other": "autre",
"philosophy": "philosophie",
"physics": "physique",
"psychology": "psychologie",
},
"de": {
"biology": "Biologie",
"business": "Wirtschaft",
"chemistry": "Chemie",
"computer_science": "Informatik",
"economics": "Ökonomie",
"engineering": "Ingenieurwesen",
"health": "Gesundheit",
"history": "Geschichte",
"law": "Recht",
"math": "Mathematik",
"other": "Sonstiges",
"philosophy": "Philosophie",
"physics": "Physik",
"psychology": "Psychologie",
},
"es": {
"biology": "biología",
"business": "negocios",
"chemistry": "química",
"computer_science": "informática",
"economics": "economía",
"engineering": "ingeniería",
"health": "salud",
"history": "historia",
"law": "derecho",
"math": "matemáticas",
"other": "otro",
"philosophy": "filosofía",
"physics": "física",
"psychology": "psicología",
},
"pt": {
"biology": "biologia",
"business": "negócios",
"chemistry": "química",
"computer_science": "ciência da computação",
"economics": "economia",
"engineering": "engenharia",
"health": "saúde",
"history": "história",
"law": "direito",
"math": "matemática",
"other": "outro",
"philosophy": "filosofia",
"physics": "física",
"psychology": "psicologia",
},
"sw": {
"biology": "biolojia",
"business": "biashara",
"chemistry": "kemia",
"computer_science": "sayansi ya kompyuta",
"economics": "uchumi",
"engineering": "uhandisi",
"health": "afya",
"history": "historia",
"law": "sheria",
"math": "hisabati",
"other": "nyingine",
"philosophy": "falsafa",
"physics": "fizikia",
"psychology": "saikolojia",
},
"th": {
"biology": "ชีววิทยา",
"business": "ธุรกิจ",
"chemistry": "เคมี",
"computer_science": "วิทยาการคอมพิวเตอร์",
"economics": "เศรษฐศาสตร์",
"engineering": "วิศวกรรมศาสตร์",
"health": "สุขภาพ",
"history": "ประวัติศาสตร์",
"law": "กฎหมาย",
"math": "คณิตศาสตร์",
"other": "อื่นๆ",
"philosophy": "ปรัชญา",
"physics": "ฟิสิกส์",
"psychology": "จิตวิทยา",
},
"ar": {
"biology": "علم الأحياء",
"business": "الأعمال",
"chemistry": "الكيمياء",
"computer_science": "علوم الكمبيوتر",
"economics": "الاقتصاد",
"engineering": "الهندسة",
"health": "الصحة",
"history": "التاريخ",
"law": "القانون",
"math": "الرياضيات",
"other": "أخرى",
"philosophy": "الفلسفة",
"physics": "الفيزياء",
"psychology": "علم النفس",
},
"hi": {
"biology": "जीव विज्ञान",
"business": "व्यापार",
"chemistry": "रसायन विज्ञान",
"computer_science": "कंप्यूटर विज्ञान",
"economics": "अर्थशास्त्र",
"engineering": "इंजीनियरिंग",
"health": "स्वास्थ्य",
"history": "इतिहास",
"law": "कानून",
"math": "गणित",
"other": "अन्य",
"philosophy": "दर्शनशास्त्र",
"physics": "भौतिकी",
"psychology": "मनोविज्ञान",
},
"bn": {
"biology": "জীববিজ্ঞান",
"business": "ব্যবসা",
"chemistry": "রসায়ন",
"computer_science": "কম্পিউটার বিজ্ঞান",
"economics": "অর্থনীতি",
"engineering": "প্রকৌশল",
"health": "স্বাস্থ্য",
"history": "ইতিহাস",
"law": "আইন",
"math": "গণিত",
"other": "অন্যান্য",
"philosophy": "দর্শন",
"physics": "পদার্থবিজ্ঞান",
"psychology": "মনোবিজ্ঞান",
},
}
import os
import shutil
import yaml
from lang_libs import LANG_LIBS, LANG_SUBJECTS
language_word_to_abbr = {
"English": "en",
"Japanese": "ja",
"Chinese": "zh",
"Korean": "ko",
"French": "fr",
"German": "de",
"Spanish": "es",
"Portuguese": "pt",
"Swahili": "sw",
"Thai": "th",
"Arabic": "ar",
"Hindi": "hi",
"Bengali": "bn",
}
language_abbr_to_word = {v: k for k, v in language_word_to_abbr.items()}


if __name__ == "__main__":
    mmlu_pro_config_dir = "../mmlu_pro"
    mmlu_prox_repo_id = "li-lab/MMLU-ProX"

    for lang_abbr in language_abbr_to_word:
        os.makedirs(lang_abbr, exist_ok=True)
        lang_lib_list = LANG_LIBS[lang_abbr]
        lang_sbj_dict = LANG_SUBJECTS[lang_abbr]

        # Fill the shared template with the per-language values.
        with (
            open("template/_lang_template_yaml", "r", encoding="utf-8") as reader,
            open(
                f"{lang_abbr}/_{lang_abbr}_template_yaml", "w", encoding="utf-8"
            ) as writer,
        ):
            for line in reader.readlines():
                if "{repo_id}" in line:
                    line = line.format(repo_id=mmlu_prox_repo_id)
                if "{lang}" in line:
                    line = line.format(lang=lang_abbr)
                if "{ans_regex}" in line:
                    # Raw string: "\(" is not a valid escape in a normal literal.
                    ans_regex = lang_lib_list[-1].replace(
                        "({})", r"\(?([ABCDEFGHIJ])\)?"
                    )
                    if lang_abbr == "en":
                        # removeprefix drops the literal prefix; lstrip("the")
                        # would strip any run of the characters t, h, and e.
                        ans_regex = ans_regex.removeprefix("the").strip()
                    line = line.format(ans_regex=ans_regex)
                if "{que_prefix}" in line:
                    line = line.format(que_prefix=lang_lib_list[0])
                writer.write(line)

        shutil.copy("template/utils.py", f"{lang_abbr}/utils.py")

        # Group config that aggregates all subject tasks for this language.
        group_name = f"mmlu_prox_{lang_abbr}"
        group_dict = dict(
            group=group_name,
            task=[f"{group_name}_{sbj}" for sbj in LANG_SUBJECTS[lang_abbr]],
            aggregate_metric_list=[
                dict(
                    aggregation="mean",
                    metric="exact_match",
                    weight_by_size=True,
                    filter_list="custom-extract",
                )
            ],
            metadata=dict(version=0.0),
        )
        with open(f"{lang_abbr}/_{group_name}.yaml", "w", encoding="utf-8") as f:
            yaml.dump(
                group_dict,
                f,
                default_flow_style=False,
                allow_unicode=True,
                sort_keys=False,
            )

        for sbj in lang_sbj_dict:
            # Reuse the process_docs line from the matching MMLU-Pro config.
            with open(
                f"{mmlu_pro_config_dir}/mmlu_pro_{sbj}.yaml", "r", encoding="utf-8"
            ) as f:
                sbj_yaml_last_line = None
                for line in f.readlines():
                    if line.startswith("process_docs:"):
                        sbj_yaml_last_line = line.strip()

            sbj_dict = dict(
                description=lang_lib_list[3].format(
                    subject=lang_sbj_dict[sbj], ans_suffix=lang_lib_list[5].format("X")
                )
                + "\n",
                include=f"_{lang_abbr}_template_yaml",
                task=f"{group_name}_{sbj}",
                task_alias=sbj,
            )

            with open(
                f"{lang_abbr}/{group_name}_{sbj}.yaml", "w", encoding="utf-8"
            ) as f:
                yaml.dump(
                    sbj_dict,
                    f,
                    default_flow_style=False,
                    allow_unicode=True,
                    sort_keys=False,
                )
            # The "!function" tag cannot go through yaml.dump, so the line is
            # appended as plain text.
            with open(
                f"{lang_abbr}/{group_name}_{sbj}.yaml", "a", encoding="utf-8"
            ) as f:
                f.write(sbj_yaml_last_line + "\n")

        print(f"Finished {lang_abbr}")
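
The two string substitutions the generator performs per language are easy to check in isolation. The sketch below applies them to the Portuguese templates from `LANG_LIBS["pt"]` and the subject name from `LANG_SUBJECTS["pt"]`, copied in as literals so the snippet runs without the repository. The variable names are illustrative only.

```python
# Entries 3 and 5 of LANG_LIBS["pt"], copied verbatim.
description_template = (
    "A seguir estão perguntas de múltipla escolha (com respostas) sobre {subject}. "
    'Pense passo a passo e termine sua resposta com "{ans_suffix}" '
    "onde X é a letra da opção correta."
)
answer_template = "A resposta é ({})"

# 1. Task description: fill in the subject and the answer phrase with "X".
ans_suffix = answer_template.format("X")
description = (
    description_template.format(subject="biologia", ans_suffix=ans_suffix) + "\n"
)

# 2. Extraction regex: swap the "({})" slot for a capture group over A-J,
#    with the parentheses made optional.
ans_regex = answer_template.replace("({})", r"\(?([ABCDEFGHIJ])\)?")

print(description)
print(ans_regex)
```

The resulting description and pattern match what appears in the generated `mmlu_prox_pt_biology` task file and the `regex_pattern` of the Portuguese template.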
group: mmlu_prox_pt
task:
- mmlu_prox_pt_biology
- mmlu_prox_pt_business
- mmlu_prox_pt_chemistry
- mmlu_prox_pt_computer_science
- mmlu_prox_pt_economics
- mmlu_prox_pt_engineering
- mmlu_prox_pt_health
- mmlu_prox_pt_history
- mmlu_prox_pt_law
- mmlu_prox_pt_math
- mmlu_prox_pt_other
- mmlu_prox_pt_philosophy
- mmlu_prox_pt_physics
- mmlu_prox_pt_psychology
aggregate_metric_list:
- aggregation: mean
  metric: exact_match
  weight_by_size: true
  filter_list: custom-extract
metadata:
  version: 0.0
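
As a rough illustration of what `weight_by_size: true` implies for the group score, the sketch below contrasts a size-weighted mean with a plain mean over subtasks. It assumes the flag means each subtask's `exact_match` is weighted by its number of documents; the helper names and the scores are made up and are not part of the harness.

```python
def size_weighted_mean(results):
    """results maps subtask name -> (exact_match score, number of documents)."""
    total_docs = sum(n for _, n in results.values())
    return sum(score * n for score, n in results.values()) / total_docs


def unweighted_mean(results):
    return sum(score for score, _ in results.values()) / len(results)


# Made-up scores and sizes, for illustration only.
results = {
    "mmlu_prox_pt_biology": (0.50, 200),
    "mmlu_prox_pt_law": (0.25, 600),
}

weighted = size_weighted_mean(results)  # larger subtasks count for more
unweighted = unweighted_mean(results)  # every subtask counts equally
print(weighted, unweighted)
```

With these numbers the weighted score is pulled toward the larger subtask, which is the behavior one would expect when subject sizes differ.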
dataset_path: li-lab/MMLU-ProX
dataset_name: pt
test_split: test
fewshot_split: validation
fewshot_config:
  sampler: first_n
  doc_to_text: !function utils.fewshot_to_text
  doc_to_target: ""
output_type: generate_until
doc_to_text: !function utils.doc_to_text
doc_to_target: answer
filter_list:
  - name: "custom-extract"
    filter:
      - function: "regex"
        regex_pattern: 'A resposta é \(?([ABCDEFGHIJ])\)?'
      - function: "take_first"
generation_kwargs:
  until:
    - "</s>"
    - "Q:"
    - "Pergunta:"
    - "<|im_end|>"
  do_sample: false
  temperature: 0.0
  max_gen_toks: 2048
num_fewshot: 5
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
    ignore_case: true
    ignore_punctuation: true
metadata:
  version: 0.0
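
The `custom-extract` filter and the `exact_match` settings can be approximated with the standard library to see which generations would score. This is a sketch of the intended behavior, not the harness's own filter code: the helper names, the `"[invalid]"` fallback value, and the sample generations are all assumptions.

```python
import re
import string

# Same pattern as regex_pattern in the Portuguese template.
ANSWER_PATTERN = re.compile(r"A resposta é \(?([ABCDEFGHIJ])\)?")


def extract_answer(generation, fallback="[invalid]"):
    """Return the first captured option letter, or a fallback if none is found."""
    match = ANSWER_PATTERN.search(generation)
    return match.group(1) if match else fallback


def exact_match(prediction, target):
    """Compare after lowercasing and dropping ASCII punctuation."""

    def normalize(text):
        return text.lower().translate(str.maketrans("", "", string.punctuation))

    return normalize(prediction) == normalize(target)


# Hypothetical model outputs.
with_parens = extract_answer("Vamos pensar passo a passo. A resposta é (C).")
without_parens = extract_answer("A resposta é B")
no_answer = extract_answer("Não tenho certeza.")

print(with_parens, without_parens, no_answer)
print(exact_match(with_parens, "C"), exact_match(no_answer, "C"))
```

Note that the pattern itself is case-sensitive, so a generation that writes the answer phrase in a different case is not extracted; `ignore_case` only affects the comparison that happens after extraction.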
description: 'A seguir estão perguntas de múltipla escolha (com respostas) sobre biologia.
  Pense passo a passo e termine sua resposta com "A resposta é (X)" onde X é a letra
  da opção correta.

  '
include: _pt_template_yaml
task: mmlu_prox_pt_biology
task_alias: biology
process_docs: !function utils.process_biology