@@ -97,7 +97,7 @@ We would like to mention that the evaluation of model answers using the GPT-3.5
## Data Format
### Questions
The file [questions.json](./sample/questions.json) shows the example questions used to evaluate the performance of the model. The current sample questions are collected from [FastChat](https://github.com/lm-sys/FastChat/blob/main/fastchat/eval/table/question.jsonl). Each question record has the following fields:
The file [questions.json](./sample/questions.json) shows the example questions used to evaluate the performance of the model. Each question record has the following fields:
* `id` (int, compulsory): The ID of the instruction / question.
* `instruction` (str, compulsory): The instruction / question for the LLM.
* `input` (str, optional): The additional context of the instruction / question.
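
The field rules above can be checked mechanically before running an evaluation. The following is a minimal sketch, not part of the evaluation pipeline: the helper names `validate_question` and `load_questions` are hypothetical, and it assumes the question file is a single JSON array of objects.

```python
import json
from typing import Any, Dict, List

# Field rules taken from the list above; everything else here is illustrative.
COMPULSORY_FIELDS = ("id", "instruction")


def validate_question(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one question record (empty if none)."""
    problems = []
    for name in COMPULSORY_FIELDS:
        if name not in record:
            problems.append(f"missing compulsory field: {name}")
    if "instruction" in record and not isinstance(record["instruction"], str):
        problems.append("instruction must be a string")
    # `input` is optional, but when present it should be a string.
    if "input" in record and not isinstance(record["input"], str):
        problems.append("input must be a string")
    return problems


def load_questions(path: str) -> List[Dict[str, Any]]:
    """Load a question file (assumed to be a JSON array) and validate each record."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    for record in records:
        problems = validate_question(record)
        if problems:
            raise ValueError(f"invalid question record {record.get('id')!r}: {problems}")
    return records


sample = json.loads(
    '{"id": 1, "instruction": "Help me summarize the following news?", "input": ""}'
)
print(validate_question(sample))  # prints []
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a record at once; `load_questions` then decides whether any problem is fatal.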
"instruction":"How can I improve my time management skills?",
"instruction":"Help me summarize the following news?",
"input":"",
"input":"National Commercial Bank (NCB), Saudi Arabia's largest lender by assets, agreed to buy rival Samba Financial Group for $15 billion in the biggest banking takeover this year.NCB will pay 28.45 riyals ($7.58) for each Samba share, according to a statement on Sunday, valuing it at about 55.7 billion riyals. NCB will offer 0.739 new shares for each Samba share, at the lower end of the 0.736-0.787 ratio the banks set when they signed an initial framework agreement in June.The offer is a 3.5% premium to Samba's Oct. 8 closing price of 27.50 riyals and about 24% higher than the level the shares traded at before the talks were made public. Bloomberg News first reported the merger discussions.The new bank will have total assets of more than $220 billion, creating the Gulf region's third-largest lender. The entity's $46 billion market capitalization nearly matches that of Qatar National Bank QPSC, which is still the Middle East's biggest lender with about $268 billion of assets.",
"output":"",
"output":"NCB to pay 28.45 riyals for each Samba share. Deal will create Gulf region's third-largest lender",
"id":1
"category":"closed qa"
},
{
"category":"generic",
"instruction":"What are the most effective ways to deal with stress?",
"input":"",
"output":"",
"id":2
},
{
"category":"generic",
"instruction":"What are the main differences between Python and JavaScript programming languages?",
"input":"",
"output":"",
"id":3
},
{
"category":"generic",
"instruction":"How can I increase my productivity while working from home?",
"input":"",
"output":"",
"id":4
},
{
"category":"generic",
"instruction":"Can you explain the basics of quantum computing?",
"input":"",
"output":"",
"id":5
},
{
"category":"generic",
"instruction":"What are the differences between plant-based and animal-based protein sources?",
"input":"",
"output":"",
"id":6
},
{
"category":"generic",
"instruction":"How can I develop my critical thinking skills?",
"input":"",
"output":"",
"id":7
},
{
"category":"generic",
"instruction":"What are the major challenges faced by the education sector today?",
"input":"",
"output":"",
"id":8
},
{
"category":"generic",
"instruction":"What are the primary factors that influence consumer behavior?",
"input":"",
"output":"",
"id":9
},
{
"category":"generic",
"instruction":"What are the most effective strategies for conflict resolution in the workplace?",
"input":"",
"output":"",
"id":10
},
{
"category":"knowledge",
"instruction":"What are some potential implications of using a single-use plastic bottle versus a reusable bottle on both the environment and human health?",
"input":"",
"output":"",
"id":11
},
{
"category":"knowledge",
"instruction":"What factors would you consider when designing an inclusive and accessible public transportation system?",
"input":"",
"output":"",
"id":12
},
{
"category":"knowledge",
"instruction":"How can governments utilize fiscal and monetary policies to combat economic recessions?",
"input":"",
"output":"",
"id":13
},
{
"category":"knowledge",
"instruction":"How do language and cultural barriers affect the way people communicate and form relationships in multicultural societies?",
"input":"",
"output":"",
"id":14
},
{
"category":"knowledge",
"instruction":"Describe a scenario where artificial intelligence could be used to improve the quality and efficiency of healthcare delivery.",
"input":"",
"output":"",
"id":15
},
{
"category":"knowledge",
"instruction":"Explain the process of gene editing using CRISPR-Cas9 technology, and discuss its potential applications and ethical implications.",
"input":"",
"output":"",
"id":16
},
{
"category":"knowledge",
"instruction":"How do vaccinations work to protect individuals and communities from infectious diseases, and what is herd immunity?",
"input":"",
"output":"",
"id":17
},
{
"category":"knowledge",
"instruction":"How do social media platforms influence the way people consume and share news, and what are the potential implications for the spread of misinformation?",
"input":"",
"output":"",
"id":18
},
{
"category":"knowledge",
"instruction":"How do cultural, social, and economic factors influence people's food choices, and how can this knowledge be used to promote healthier diets?",
"input":"",
"output":"",
"id":19
},
{
"category":"knowledge",
"instruction":"Explain the process of natural selection and how it contributes to the evolution and adaptation of species.",
"input":"",
"output":"",
"id":20
},
{
"category":"roleplay",
"instruction":"How would you introduce yourself as a medieval knight at a royal banquet?",
"input":"",
"output":"",
"id":21
},
{
"category":"roleplay",
"instruction":"As a pirate captain, what would you say to your crew to motivate them to search for hidden treasure?",
"input":"",
"output":"",
"id":22
},
{
"category":"roleplay",
"instruction":"If you were a Shakespearean character, how would you declare your love for someone in a soliloquy?",
"input":"",
"output":"",
"id":23
},
{
"category":"roleplay",
"instruction":"As a superhero, how would you explain your origin story to a curious child?",
"input":"",
"output":"",
"id":24
},
{
"category":"roleplay",
"instruction":"Imagine you are a time traveler from the year 3000. What technological advancements would you tell people about?",
"input":"",
"output":"",
"id":25
},
{
"category":"roleplay",
"instruction":"As a sports commentator, describe the winning play in the final seconds of a championship game.",
"input":"",
"output":"",
"id":26
},
{
"category":"roleplay",
"instruction":"Pretend to be a world-famous chef. How would you describe your signature dish to a panel of judges?",
"input":"",
"output":"",
"id":27
},
{
"category":"roleplay",
"instruction":"You are a mountain climber reaching the summit of Mount Everest. Describe your emotions and the view from the top.",
"input":"",
"output":"",
"id":28
},
{
"category":"roleplay",
"instruction":"As a space colonist on Mars, describe your daily life and the challenges you face living on another planet.",
"input":"",
"output":"",
"id":29
},
{
"category":"roleplay",
"instruction":"Pretend to be a character in a post-apocalyptic world. Describe how you survive and the allies you encounter.",
"input":"",
"output":"",
"id":30
},
{
"category":"common-sense",
"instruction":"How can you determine if a restaurant is popular among locals or mainly attracts tourists, and why might this information be useful?",
"input":"",
"output":"",
"id":31
},
{
"category":"common-sense",
"instruction":"What are some subtle clues that suggest someone is pretending to understand a topic or conversation when they are actually confused or uninformed?",
"input":"",
"output":"",
"id":32
},
{
"category":"common-sense",
"instruction":"Why might someone choose to use a paper map or ask for directions instead of relying on a GPS device or smartphone app?",
"input":"",
"output":"",
"id":33
},
{
"category":"common-sense",
"instruction":"How can you determine if a person is genuinely interested in a conversation or simply being polite?",
"input":"",
"output":"",
"id":34
},
{
"category":"common-sense",
"instruction":"Why might someone prefer to shop at a small, locally-owned business instead of a large chain store, even if the prices are higher?",
"input":"",
"output":"",
"id":35
},
{
"category":"common-sense",
"instruction":"How can you assess the credibility of a source of information, such as a news article or blog post, without relying solely on the reputation of the author or publisher?",
"input":"",
"output":"",
"id":36
},
{
"category":"common-sense",
"instruction":"Why do some people enjoy the sensation of being scared, such as by watching horror movies or going on roller coasters, while others avoid these experiences?",
"input":"",
"output":"",
"id":37
},
{
"category":"common-sense",
"instruction":"How can observing the behavior of other people in a social situation provide clues about cultural norms and expectations?",
"input":"",
"output":"",
"id":38
},
{
"category":"common-sense",
"instruction":"Do we have a moral obligation to explore space, or should we focus on solving Earth's problems first?",
"input":"",
"output":"",
"id":39
},
{
"category":"common-sense",
"instruction":"In a world where automation is becoming increasingly prevalent, is it more important to prioritize job creation or technological progress?",
"input":"",
"output":"",
"id":40
},
{
"category":"fermi",
"instruction":"How many times does the average human blink in a lifetime? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":41
},
{
"category":"fermi",
"instruction":"How many atoms are in a grain of salt? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":42
},
{
"category":"fermi",
"instruction":"How many lightning strikes occur on Earth each day? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":43
},
{
"category":"fermi",
"instruction":"How many balloons would it take to lift a house like in the movie \"Up\"? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":44
},
{
"category":"fermi",
"instruction":"How many text messages are sent globally in a minute? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":45
},
{
"category":"fermi",
"instruction":"How many words are spoken daily on Earth? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":46
},
{
"category":"fermi",
"instruction":"How many snowflakes fall during a typical winter? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":47
},
{
"category":"fermi",
"instruction":"How many pages are in all the books ever written? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":48
},
{
"category":"fermi",
"instruction":"How many times has the Earth orbited the Sun since the beginning of life? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":49
},
{
"category":"fermi",
"instruction":"How many songs have been recorded throughout history? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.",
"input":"",
"output":"",
"id":50
},
{
"category":"counterfactual",
"instruction":"What if the Internet had been invented during the Renaissance period?",
"input":"",
"output":"",
"id":51
},
{
"category":"counterfactual",
"instruction":"What if the Aztecs had successfully repelled the Spanish conquistadors?",
"input":"",
"output":"",
"id":52
},
{
"category":"counterfactual",
"instruction":"What if the Black Death had not occurred in the 14th century?",
"input":"",
"output":"",
"id":53
},
{
"category":"counterfactual",
"instruction":"What if Isaac Newton had focused on biology instead of physics?",
"input":"",
"output":"",
"id":54
},
{
"category":"counterfactual",
"instruction":"What if the Beatles had never formed as a band?",
"input":"",
"output":"",
"id":55
},
{
"category":"counterfactual",
"instruction":"What if Alan Turing had not cracked the Enigma code during World War II?",
"input":"",
"output":"",
"id":56
},
{
"category":"counterfactual",
"instruction":"What if the Suez Canal had never been constructed?",
"input":"",
"output":"",
"id":57
},
{
"category":"counterfactual",
"instruction":"What if the Maya civilization had never mysteriously collapsed?",
"input":"",
"output":"",
"id":58
},
{
"category":"counterfactual",
"instruction":"What if Christopher Columbus had not discovered the Americas?",
"input":"",
"output":"",
"id":59
},
{
"category":"counterfactual",
"instruction":"What if Vincent van Gogh had been a successful artist during his lifetime?",
"input":"",
"output":"",
"id":60
},
{
"category":"coding",
"instruction":"Develop a C++ program that reads a text file line by line and counts the number of occurrences of a specific word in the file.",
"input":"",
"output":"",
"id":61
},
{
"category":"coding",
"instruction":"Implement a Python function to find the longest common subsequence of two input strings using dynamic programming.",
"input":"",
"output":"",
"id":62
},
{
"category":"coding",
"instruction":"Implement a regular expression in Python to validate an email address.",
"input":"",
"output":"",
"id":63
},
{
"category":"coding",
"instruction":"Write a program to find the nth Fibonacci number using dynamic programming.",
"input":"",
"output":"",
"id":64
},
{
"category":"coding",
"instruction":"Implement a binary search algorithm to find a specific element in a sorted array.",
"input":"",
"output":"",
"id":65
},
{
"category":"coding",
"instruction":"Implement a queue data structure using two stacks in Python.",
"input":"",
"output":"",
"id":66
},
{
"category":"coding",
"instruction":"Implement a program to find the common elements in two arrays without using any extra data structures.",
"input":"",
"output":"",
"id":67
},
{
"category":"math",
"instruction":"Given that f(x) = 5x^3 - 2x + 3, find the value of f(2).",
"input":"",
"output":"",
"id":68
},
{
"category":"math",
"instruction":"Solve for x in the equation 3x + 10 = 5(x - 2).",
"input":"",
"output":"",
"id":69
},
{
"category":"math",
"instruction":"If the endpoints of a line segment are (2, -2) and (10, 4), what is the length of the segment?",
"input":"",
"output":"",
"id":70
},
{
"category":"writing",
"instruction":"Can you help me write a formal email to a potential business partner proposing a joint venture?",
"input":"",
"output":"",
"id":71
},
{
"category":"writing",
"instruction":"Can you help me write a resignation letter to my current employer, while leaving on good terms and expressing gratitude for the opportunities provided?",
"input":"",
"output":"",
"id":72
},
{
"category":"writing",
"instruction":"Use an appropriate format to structure a formal letter of recommendation for a student applying to a prestigious graduate program in computer science.",
"input":"",
"output":"",
"id":73
},
{
"category":"writing",
"instruction":"Write a compelling product launch announcement email to inform our customers of our new software solution.",
"input":"",
"output":"",
"id":74
},
{
"category":"writing",
"instruction":"Draft an apology email to a customer who experienced a delay in their order, and provide reassurance that the issue has been resolved.",
"input":"",
"output":"",
"id":75
},
{
"category":"writing",
"instruction":"Write a script for a YouTube video exploring the history and cultural significance of jazz.",
"input":"",
"output":"",
"id":76
},
{
"category":"writing",
"instruction":"Compose an engaging travel blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions.",
"input":"",
"output":"",
"id":77
},
{
"category":"writing",
"instruction":"Write a captivating movie review for a recently released science fiction film, discussing its plot, characters, and special effects.",
"input":"",
"output":"",
"id":78
},
{
"category":"writing",
"instruction":"Structure a podcast script for an episode discussing the influence of streaming platforms on the music industry.",
"input":"",
"output":"",
"id":79
},
{
"category":"writing",
"instruction":"Write a symphony concert review, discussing the orchestra's performance and overall audience experience.",