{"danish":{"source1":"2 . Udfyld felterne i hvert trin i vejledningen . ","source2":"* Vise rapporter med finansposter og saldi . ","target1":"2 . Fill in the fields in each step of the guide . ","target2":"* View reports that show general ledger entries and balances . "},"chinese":{"source1":"返回 与 筛选器 初始化 由 平台 的 MCDRemoteSystemPlatformFilter 对象 。 ","source2":"用于 将 本地 的 ( 调用 ) 应用 程序 可 见性 首选 项 设置 发现 远程 系统 时 的 类 。 ","target1":"Returns an MCDRemoteSystemPlatformFilter object initialized with a filter by platform . ","target2":"A class used to set the local ( calling ) application visibility preference when discovering remote systems ."},"norwegian":{"source1":"Kosttypesaldo = Kostsentersaldo + Kostobjektsaldo ","source2":"* Vise en liste over bokføringsgrupper som du posterer til kontoen . ","target1":"Cost Type Balance = Cost Center Balance + Cost Object Balance ","target2":"* See a list of posting groups that post to that account . "},"latvian":{"source1":"# # < a name = " 6-change-the-status-of-the-conversion-record-to-ready " > < / a > 6 . Mainiet pārveidošanas ieraksta statusu uz Gatavs ","source2":"title : Preču saņemšanas reģistrēšana pirkšanas pasūtījumā ","target1":"# # 6 . Change the status of the conversion record to Ready ","target2":"title : Record the receipt of goods on the purchase order "}}
{"instruction1":"convert a list of integers into a single integer","instruction2":"how to convert a datetime string back to datetime object?","solution1":"r = int(''.join(map(str, x)))","solution2":"datetime.datetime.strptime(str, '%m/%d/%Y')"}
{"instruction1":"get the distance of map coordinates to the center ","instruction2":"check if details are parsed","solution1":"float function ( int arg0 , int arg1 ) { int loc0 = arg0 - cx ; int loc1 = arg1 - cy ; return getSquaredDistance ( loc0 , loc1 ) ; }","solution2":"boolean function ( ) { return isParsed ; }"}
"questions":["Olivia has $23. She bought five bagels for $3 each. How much money does she have left?",
"Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?",
"There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?",
"Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?",
"Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?",
"Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?",
"If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?",
"There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?"],
@article{gao2022pal,
title={PAL: Program-aided Language Models},
author={Gao, Luyu and Madaan, Aman and Zhou, Shuyan and Alon, Uri and Liu, Pengfei and Yang, Yiming and Callan, Jamie and Neubig, Graham},
journal={arXiv preprint arXiv:2211.10435},
year={2022}
}
@article{cobbe2021gsm8k,
title={Training Verifiers to Solve Math Word Problems},
author={Cobbe, Karl and Kosaraju, Vineet and Bavarian, Mohammad and Chen, Mark and Jun, Heewoo and Kaiser, Lukasz and Plappert, Matthias and Tworek, Jerry and Hilton, Jacob and Nakano, Reiichiro and Hesse, Christopher and Schulman, John},
journal={arXiv preprint arXiv:2110.14168},
year={2021}
}
"""
# Number of few shot examples to consider
NUM_SHOTS = 8


class EvaluationType(str, Enum):
    """Possible values for the evaluation type argument"""

    GREEDY = "greedy"
    MAJORITY_VOTING = "majority_voting"


def create_all_tasks():
    """Creates a dictionary of tasks for all evaluation types
    :return: {task_name: task}
        e.g. {pal-gsm8k-greedy: Task, pal-gsm8k-majority_voting: Task}
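As a rough sketch of the naming scheme this docstring describes, assuming two evaluation types and a placeholder standing in for the real Task classes:

```python
from enum import Enum

class EvaluationType(str, Enum):
    GREEDY = "greedy"
    MAJORITY_VOTING = "majority_voting"

def create_all_tasks():
    # One entry per evaluation type, keyed "pal-gsm8k-<type>";
    # the real code would map each key to a Task class, not the enum member
    return {f"pal-gsm8k-{t.value}": t for t in EvaluationType}

tasks = create_all_tasks()
assert sorted(tasks) == ["pal-gsm8k-greedy", "pal-gsm8k-majority_voting"]
```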
title={Evaluating Large Language Models Trained on Code},
author={Mark Chen and Jerry Tworek and Heewoo Jun and Qiming Yuan and Henrique Ponde de Oliveira Pinto and Jared Kaplan and Harri Edwards and Yuri Burda and Nicholas Joseph and Greg Brockman and Alex Ray and Raul Puri and Gretchen Krueger and Michael Petrov and Heidy Khlaaf and Girish Sastry and Pamela Mishkin and Brooke Chan and Scott Gray and Nick Ryder and Mikhail Pavlov and Alethea Power and Lukasz Kaiser and Mohammad Bavarian and Clemens Winter and Philippe Tillet and Felipe Petroski Such and Dave Cummings and Matthias Plappert and Fotios Chantzis and Elizabeth Barnes and Ariel Herbert-Voss and William Hebgen Guss and Alex Nichol and Alex Paino and Nikolas Tezak and Jie Tang and Igor Babuschkin and Suchir Balaji and Shantanu Jain and William Saunders and Christopher Hesse and Andrew N. Carr and Jan Leike and Josh Achiam and Vedant Misra and Evan Morikawa and Alec Radford and Matthew Knight and Miles Brundage and Mira Murati and Katie Mayer and Peter Welinder and Bob McGrew and Dario Amodei and Sam McCandlish and Ilya Sutskever and Wojciech Zaremba},
year={2021},
eprint={2107.03374},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
"""
def create_all_tasks():
    """Creates a dictionary of tasks from a list of levels
title={OctoPack: Instruction Tuning Code Large Language Models},
author={Niklas Muennighoff and Qian Liu and Armel Zebaze and Qinkai Zheng and Binyuan Hui and Terry Yue Zhuo and Swayam Singh and Xiangru Tang and Leandro von Werra and Shayne Longpre},
prompt = f'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{inp}\n\n### Response:{prompt_base}'
prompt = f'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{inp}\n\n### Response:\n{prompt_base}'
prompt = f"<issue_start>username_0: {instruction}\n\n```{context}```\nUpvotes: 100<issue_comment>username_1: Sure, here is the fixed code.\n\n```{prompt_base}"
title={OctoPack: Instruction Tuning Code Large Language Models},
author={Niklas Muennighoff and Qian Liu and Armel Zebaze and Qinkai Zheng and Binyuan Hui and Terry Yue Zhuo and Swayam Singh and Xiangru Tang and Leandro von Werra and Shayne Longpre},
title={Evaluating Large Language Models Trained on Code},
author={Mark Chen and Jerry Tworek and Heewoo Jun and Qiming Yuan and Henrique Ponde de Oliveira Pinto and Jared Kaplan and Harri Edwards and Yuri Burda and Nicholas Joseph and Greg Brockman and Alex Ray and Raul Puri and Gretchen Krueger and Michael Petrov and Heidy Khlaaf and Girish Sastry and Pamela Mishkin and Brooke Chan and Scott Gray and Nick Ryder and Mikhail Pavlov and Alethea Power and Lukasz Kaiser and Mohammad Bavarian and Clemens Winter and Philippe Tillet and Felipe Petroski Such and Dave Cummings and Matthias Plappert and Fotios Chantzis and Elizabeth Barnes and Ariel Herbert-Voss and William Hebgen Guss and Alex Nichol and Alex Paino and Nikolas Tezak and Jie Tang and Igor Babuschkin and Suchir Balaji and Shantanu Jain and William Saunders and Christopher Hesse and Andrew N. Carr and Jan Leike and Josh Achiam and Vedant Misra and Evan Morikawa and Alec Radford and Matthew Knight and Miles Brundage and Mira Murati and Katie Mayer and Peter Welinder and Bob McGrew and Dario Amodei and Sam McCandlish and Ilya Sutskever and Wojciech Zaremba},
year={2021},
eprint={2107.03374},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
"""
def generate_prompt(input):
    INSTRUCTION = f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Create a Python script for this problem:
{input}

### Response:"""
    return INSTRUCTION


class HumanEvalWizardCoder(Task):
    """A task represents an entire benchmark including its dataset, problems,
    answers, generation settings and evaluation methods.
    """

    DATASET_PATH = "openai_humaneval"

    def __init__(self):
        super().__init__(
            stop_words=[],
            requires_execution=True,
        )

    def get_dataset(self):
        """Returns dataset for the task or an iterable of any object that get_prompt can handle"""
        return self.dataset["test"]

    def get_prompt(self, doc):
        """Builds the prompt for the LM to generate from."""
        prompt = doc["prompt"].replace("    ", "\t")  # swap 4-space indents for tabs
        prompt = generate_prompt(prompt)
        return prompt

    def get_reference(self, doc):
        """Builds the reference solution for the doc (sample from the test dataset)."""
        test_func = doc["test"]
        entry_point = f"check({doc['entry_point']})"
        return "\n" + test_func + "\n" + entry_point
    @staticmethod
    def clean_comp(completion):
        # adapted from https://github.com/nlpxucan/WizardLM/blob/main/WizardCoder/src/process_humaneval.py
title={Program Synthesis with Large Language Models},
author={Austin, Jacob and Odena, Augustus and Nye, Maxwell and Bosma, Maarten and Michalewski, Henryk and Dohan, David and Jiang, Ellen and Cai, Carrie and Terry, Michael and Le, Quoc and others},
journal={arXiv preprint arXiv:2108.07732},
year={2021}
}
"""
class MBPP(Task):
    """A task represents an entire benchmark including its dataset, problems,
    answers, generation settings and evaluation methods.
title={A Scalable and Extensible Approach to Benchmarking NL2Code for 18 Programming Languages},
author={Cassano, Federico and Gouwar, John and Nguyen, Daniel and Nguyen, Sydney and Phipps-Costin, Luna and Pinckney, Donald and Yee, Ming Ho and Zi, Yangtian and Anderson, Carolyn Jane and Feldman, Molly Q and others},
journal={arXiv preprint arXiv:2208.08227},
year={2022}
}
"""
LANGUAGES = [
"py",
"sh",
"cpp",
"cs",
"d",
"go",
"java",
"js",
"jl",
"lua",
"pl",
"php",
"r",
"rkt",
"rb",
"rs",
"scala",
"swift",
"ts",
]
def create_all_tasks():
    """Creates a dictionary of tasks from a list of levels
title={QuixBugs: A multi-lingual program repair benchmark set based on the Quixey Challenge},
author={Lin, Derrick and Koppel, James and Chen, Angela and Solar-Lezama, Armando},
booktitle={Proceedings Companion of the 2017 ACM SIGPLAN international conference on systems, programming, languages, and applications: software for humanity},
pages={55--56},
year={2017}
}
"""
class QuixBugs(Task):
    DATASET_PATH = "Muennighoff/quixbugs"

    def __init__(self, prompt="prompt"):
        self.prompt = prompt
        if self.prompt == "edit":
            self.stop_words = [
                "<commit_before>",
                "<commit_msg>",
                "<commit_after>",
                "<|endoftext|>",
            ]
        elif self.prompt.startswith("prompt_codex"):
            # Check the more specific prefix first: "prompt_codex" also starts with "prompt"
            # https://arxiv.org/pdf/2111.03922.pdf
            self.stop_words = [
                "\nclass", "###", "///", "<|endoftext|>",
            ]
        elif self.prompt.startswith("prompt"):
            self.stop_words = [
                "\ndef",
                "\nclass",
                "\n#",
                "\n@",
                "\nprint",
                "\nif",
                "###",
                "///",
                "<|endoftext|>",
            ]
        else:
            raise ValueError(f"Unknown prompt: {self.prompt}")
        super().__init__(
            stop_words=self.stop_words,
            requires_execution=True,
        )
        self.max_length_multiplier = 3  # Allow 3 times the length of the prompt
    def get_dataset(self):
        """Returns dataset for the task or an iterable of any object that get_prompt can handle"""
        return self.dataset["train"]

    def get_prompt(self, doc):
        """Builds the prompt for the LM to generate from."""
        if self.prompt == "edit":
            prompt = "<commit_before>" + doc["buggy_program"]
            prompt += "<commit_msg>" + "Fix bug in " + doc["name"]
            prompt += "<commit_after>"
        elif self.prompt == "edit-openai":
            return doc["buggy_program"], "Fix bug in " + doc["name"]
        elif self.prompt == "prompt":
            prompt = "# Buggy function"
            prompt += "\n" + doc["buggy_program"] + "\n"
            prompt += "# Fixed function\ndef"
        elif self.prompt == "prompt_codex":
            # https://arxiv.org/pdf/2111.03922.pdf, Prenner et al.
            prompt = "### fix the bug in the following function"
            prompt += "\n" + doc["buggy_program"] + "\n"
            prompt += "### fixed function"
        else:
            raise ValueError(f"Unknown prompt: {self.prompt}")
        return prompt.strip()
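A hedged rendering of the "prompt" mode layout, using a made-up buggy program in place of a real QuixBugs sample:

```python
# Made-up sample standing in for doc["buggy_program"]
buggy_program = "def gcd(a, b):\n    if b == 0:\n        return a\n    return gcd(a % b, b)\n"

prompt = "# Buggy function"
prompt += "\n" + buggy_program + "\n"
prompt += "# Fixed function\ndef"
prompt = prompt.strip()

assert prompt.startswith("# Buggy function\ndef gcd")
assert prompt.endswith("# Fixed function\ndef")
```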
    def get_reference(self, doc):
        """Builds the reference solution for the doc (sample from the test dataset)."""
        return (doc["name"], doc["tests"].strip())

    @staticmethod
    def remove_last_block(string, stop_words):
        stop_words = [re.escape(word) for word in stop_words]  # Escape e.g. | in <|endoftext|>
        # Remove the last block of the code containing stop_words for HumanEval
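The escaping step above feeds a regex; a minimal self-contained sketch of truncating a generation at the first stop word might look like this (the helper is a simplified stand-in, not the harness implementation):

```python
import re

def truncate_at_stop_words(string, stop_words):
    # Escape regex metacharacters, e.g. the | in <|endoftext|>
    escaped = [re.escape(word) for word in stop_words]
    # Keep only the text before the first occurrence of any stop word
    return re.split("|".join(escaped), string, maxsplit=1)[0]

generation = "def f():\n    return 1\n\ndef g():\n    pass"
out = truncate_at_stop_words(generation, ["\ndef", "<|endoftext|>"])
assert out == "def f():\n    return 1\n"
```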
title={ReCode: Robustness Evaluation of Code Generation Models},
author={Wang, Shiqi and Li, Zheng and Qian, Haifeng and Yang, Chenghao and Wang, Zijian and Shang, Mingyue and Kumar, Varun and Tan, Samson and Ray, Baishakhi and Bhatia, Parminder and others},
author={Allal, Loubna Ben and Li, Raymond and Kocetkov, Denis and Mou, Chenghao and Akiki, Christopher and Ferrandis, Carlos Munoz and Muennighoff, Niklas and Mishra, Mayank and Gu, Alex and Dey, Manan and others},