DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
https://arxiv.org/pdf/2211.11501.pdf
DS-1000 is a code generation benchmark with a thousand data science questions spanning seven Python libraries that (1) reflects diverse, realistic, and practical use cases, (2) has a reliable metric, and (3) defends against memorization by perturbing questions.
Homepage: https://ds1000-code-gen.github.io/
"""
import fcntl
import functools
import io
import itertools
import pathlib
import warnings
import zipfile

import requests
import tqdm

from bigcode_eval.base import Task
_CITATION = """
@article{Lai2022DS1000,
title={DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation},
author={Yuhang Lai and Chengxi Li and Yiming Wang and Tianyi Zhang and Ruiqi Zhong and Luke Zettlemoyer and Scott Wen-tau Yih and Daniel Fried and Sida Wang and Tao Yu},