Commit 574e565a authored by Lintang Sutawika, committed by GitHub
Merge branch 'big-refactor' into verbosity-rework

parents 73f3029c b7a4ea06
-dataset_name: high_school_statistics
-description: 'The following are multiple choice questions (with answers) about high
-school statistics.
-
-
-Q: A new smartwatch is manufactured in one part of a factory, then secured for shipping
-in another, independent part of the factory. The weight of the smartwatch has a
-mean of 62 grams and a standard deviation of 1.0 grams. The weight of the packaging
-(box, user''s guide, bubble wrap, etc.) has a mean of 456 grams and a standard deviation
-of 6 grams. Together, the distribution of the weight of the smartwatch and its packaging
-would have the following mean and standard deviation:
-
-(A) Mean 518 grams; standard deviation 7.0 grams (B) Mean 518 grams; standard deviation
-3.5 grams (C) Mean 518 grams; standard deviation 6.1 grams (D) Mean 394 grams; standard
-deviation 6.1 grams
-
-A: Let''s think step by step. Since the weight of the watch and the weight of the
-packaging are independent random variables, the mean and variance of their sum is
-equal to the sum of their individual means and variances. So the mean is 62 + 456
-= 518 grams, and the variances is 1.0^2 + 6.0^2 = 37, leading to a standard deviation
-of 6.1 grams. The answer is (C).
-
-
-Q: After a frost warning was issued, the owner of a large orange grove asked his
-workers to spray all his trees with water. The water was supposed to freeze and
-form a protective covering of ice around the orange blossom. Nevertheless, the owner
-suspected that some trees suffered considerable damage due to the frost. To estimate
-the proportion of trees that suffered more than 50 percent damage due to the frost,
-he took a random sample of 100 trees from his grove. What is the response variable
-in this experiment?
-
-(A) The proportion of trees that suffered more than 50 percent damage due to frost.
-(B) The number of trees affected by the frost. (C) The number of trees sampled from
-the grove. (D) For each sampled tree, whether it suffered more than 50 percent damage
-or at most 50 percent damage.
-
-A: Let''s think step by step. In this experiment, the response variable is what
-is measured. For each tree, what is measured is whether or not it suffered more
-than 50 percent damage due to the frost. The answer is (D).
-
-
-Q: Suppose X and Y are random variables with E(X) = 37, var(X) = 5, E(Y) = 62, and
-var(Y) = 12. What are the expected value and variance of the random variable X +
-Y?
-
-(A) E(X + Y) = 99, var(X + Y) = 8.5 (B) E(X + Y) = 99, var(X + Y) = 13 (C) E(X +
-Y) = 99, var(X + Y) = 17 (D) There is insufficient information to answer this question.
-
-A: Let''s think step by step. While means of sums of random variables add (regardless
-of whether the variables are independent) in order to determine the variance of
-a sum of random variables, we need to know not just their individual variances but
-the covariance of the two variables, which is not given in this problem. The answer
-is (D).
-
-
-Q: Which of the following sets has the smallest standard deviation? Which has the
-largest?
-
-I: {1,2,3}
-
-II: {-10,10}
-
-III: {100}
-
-(A) I, II (B) II, III (C) III, I (D) III, II
-
-A: Let''s think step by step. The variance of distribution I is the expected squared
-deviation from its mean (which is 2), so the variance is 2/3 . The variance of distribution
-II is 10^2 (because both elements are 10 away from the mean of zero). The variance
-of distribution III is 0, since it has a single entry. So distribution III has the
-smallest standard deviation and distribution II has the largest. The answer is (D).
-
-
-Q: Which of the following is a correct statement about correlation?
-
-(A) If the slope of the regression line is exactly 1, then the correlation is exactly
-1. (B) If the correlation is 0, then the slope of the regression line is undefined.
-(C) Switching which variable is called x and which is called y changes the sign
-of the correlation. (D) The correlation r is equal to the slope of the regression
-line when z-scores for the y-variable are plotted against z-scores for the x-variable.
-
-A: Let''s think step by step. Statement A is false because the slope of the regression
-line being exactly 1 can occur even when the two variables are not perfectly correlated.
-Statement B is false because uncorrelated variables regression lines can have slope
-zero. Statement C is false because correlation is symmetric in the two random variables.
-The answer is (D).'
-include: _mmlu_flan_cot_fewshot_template_yaml
-task: mmlu_flan_cot_fewshot_high_school_statistics
+"dataset_name": "high_school_statistics"
+"description": "The following are multiple choice questions (with answers) about high\
+\ school statistics.\n\nQ: A new smartwatch is manufactured in one part of a factory,\
+\ then secured for shipping in another, independent part of the factory. The weight\
+\ of the smartwatch has a mean of 62 grams and a standard deviation of 1.0 grams.\
+\ The weight of the packaging (box, user's guide, bubble wrap, etc.) has a mean\
+\ of 456 grams and a standard deviation of 6 grams. Together, the distribution of\
+\ the weight of the smartwatch and its packaging would have the following mean and\
+\ standard deviation:\n(A) Mean 518 grams; standard deviation 7.0 grams (B) Mean\
+\ 518 grams; standard deviation 3.5 grams (C) Mean 518 grams; standard deviation\
+\ 6.1 grams (D) Mean 394 grams; standard deviation 6.1 grams\nA: Let's think step\
+\ by step. Since the weight of the watch and the weight of the packaging are independent\
+\ random variables, the mean and variance of their sum is equal to the sum of their\
+\ individual means and variances. So the mean is 62 + 456 = 518 grams, and the variances\
+\ is 1.0^2 + 6.0^2 = 37, leading to a standard deviation of 6.1 grams. The answer\
+\ is (C).\n\nQ: After a frost warning was issued, the owner of a large orange grove\
+\ asked his workers to spray all his trees with water. The water was supposed to\
+\ freeze and form a protective covering of ice around the orange blossom. Nevertheless,\
+\ the owner suspected that some trees suffered considerable damage due to the frost.\
+\ To estimate the proportion of trees that suffered more than 50 percent damage\
+\ due to the frost, he took a random sample of 100 trees from his grove. What is\
+\ the response variable in this experiment?\n(A) The proportion of trees that suffered\
+\ more than 50 percent damage due to frost. (B) The number of trees affected by\
+\ the frost. (C) The number of trees sampled from the grove. (D) For each sampled\
+\ tree, whether it suffered more than 50 percent damage or at most 50 percent damage.\n\
+A: Let's think step by step. In this experiment, the response variable is what is\
+\ measured. For each tree, what is measured is whether or not it suffered more than\
+\ 50 percent damage due to the frost. The answer is (D).\n\nQ: Suppose X and Y are\
+\ random variables with E(X) = 37, var(X) = 5, E(Y) = 62, and var(Y) = 12. What\
+\ are the expected value and variance of the random variable X + Y?\n(A) E(X + Y)\
+\ = 99, var(X + Y) = 8.5 (B) E(X + Y) = 99, var(X + Y) = 13 (C) E(X + Y) = 99, var(X\
+\ + Y) = 17 (D) There is insufficient information to answer this question.\nA: Let's\
+\ think step by step. While means of sums of random variables add (regardless of\
+\ whether the variables are independent) in order to determine the variance of a\
+\ sum of random variables, we need to know not just their individual variances but\
+\ the covariance of the two variables, which is not given in this problem. The answer\
+\ is (D).\n\nQ: Which of the following sets has the smallest standard deviation?\
+\ Which has the largest?\nI: {1,2,3}\nII: {-10,10}\nIII: {100}\n(A) I, II (B) II,\
+\ III (C) III, I (D) III, II\nA: Let's think step by step. The variance of distribution\
+\ I is the expected squared deviation from its mean (which is 2), so the variance\
+\ is 2/3 . The variance of distribution II is 10^2 (because both elements are 10\
+\ away from the mean of zero). The variance of distribution III is 0, since it has\
+\ a single entry. So distribution III has the smallest standard deviation and distribution\
+\ II has the largest. The answer is (D).\n\nQ: Which of the following is a correct\
+\ statement about correlation?\n(A) If the slope of the regression line is exactly\
+\ 1, then the correlation is exactly 1. (B) If the correlation is 0, then the slope\
+\ of the regression line is undefined. (C) Switching which variable is called x\
+\ and which is called y changes the sign of the correlation. (D) The correlation\
+\ r is equal to the slope of the regression line when z-scores for the y-variable\
+\ are plotted against z-scores for the x-variable.\nA: Let's think step by step.\
+\ Statement A is false because the slope of the regression line being exactly 1\
+\ can occur even when the two variables are not perfectly correlated. Statement\
+\ B is false because uncorrelated variables regression lines can have slope zero.\
+\ Statement C is false because correlation is symmetric in the two random variables.\
+\ The answer is (D)."
+"group": "mmlu_flan_cot_fewshot_stem"
+"include": "_mmlu_flan_cot_fewshot_template_yaml"
+"task": "mmlu_flan_cot_fewshot_high_school_statistics"
dataset_name: high_school_us_history "dataset_name": "high_school_us_history"
description: "The following are multiple choice questions (with answers) about high\ "description": "The following are multiple choice questions (with answers) about high\
\ school us history.\n\nQ: This question refers to the following information.\n\ \ school us history.\n\nQ: This question refers to the following information.\n\
I come not to urge personal claims, nor to seek individual benefits; I appear as\ I come not to urge personal claims, nor to seek individual benefits; I appear as\
\ the advocate of those who cannot plead their own cause; I come as the friend of\ \ the advocate of those who cannot plead their own cause; I come as the friend of\
@@ -8,126 +8,127 @@ description: "The following are multiple choice questions (with answers) about h
\ jails penetrate not your Halls of Legislation. I am the Hope of the poor crazed\ \ jails penetrate not your Halls of Legislation. I am the Hope of the poor crazed\
\ beings who pine in the cells, and stalls, and cages, and waste rooms of your poor-houses.\ \ beings who pine in the cells, and stalls, and cages, and waste rooms of your poor-houses.\
\ I am the Revelation of hundreds of wailing, suffering creatures, hidden in your\ \ I am the Revelation of hundreds of wailing, suffering creatures, hidden in your\
\ private dwellings, and in pens and cabins\u2014shut out, cut off from all healing\ \ private dwellings, and in pens and cabins—shut out, cut off from all healing influences,\
\ influences, from all mind-restoring cares.\u2026 Could their melancholy histories\ \ from all mind-restoring cares.… Could their melancholy histories be spread before\
\ be spread before you as revealed to my grieved spirit during the last three months,\ \ you as revealed to my grieved spirit during the last three months, how promptly,\
\ how promptly, how earnestly would you search out the most approved means of relief;\ \ how earnestly would you search out the most approved means of relief; how trifling,\
\ how trifling, how insignificant, by comparison, would appear the sacrifices you\ \ how insignificant, by comparison, would appear the sacrifices you are asked to\
\ are asked to make; how would a few dimes and dollars, gathered from each citizen,\ \ make; how would a few dimes and dollars, gathered from each citizen, diminish\
\ diminish in value as a possession, compared with the certain benefits and vast\ \ in value as a possession, compared with the certain benefits and vast good to\
\ good to be secured for the suffering insane...by the consecration and application\ \ be secured for the suffering insane...by the consecration and application of a\
\ of a sufficient fund to the construction of a suitable hospital.\u2026\n\u2014\ \ sufficient fund to the construction of a suitable hospital.…\n—Dorothea Dix, Memorial\
Dorothea Dix, Memorial Soliciting a State Hospital for the Protection and Cure of\ \ Soliciting a State Hospital for the Protection and Cure of the Insane,\nSubmitted\
\ the Insane,\nSubmitted to the General Assembly of North Carolina, November 1848\n\ \ to the General Assembly of North Carolina, November 1848\nDorothea Dix can best\
Dorothea Dix can best be compared to whom?\n(A) Abigail Adams (B) Clara Barton (C)\ \ be compared to whom?\n(A) Abigail Adams (B) Clara Barton (C) Shirley Temple (D)\
\ Shirley Temple (D) Hillary Clinton\nA: Let's think step by step. We refer to Wikipedia\ \ Hillary Clinton\nA: Let's think step by step. We refer to Wikipedia articles on\
\ articles on us history for help. Both Dorothea Dix and Clara barton are American\ \ us history for help. Both Dorothea Dix and Clara barton are American nurses. The\
\ nurses. The answer is (B).\n\nQ: This question refers to the following information.\n\ \ answer is (B).\n\nQ: This question refers to the following information.\n\"As\
\"As our late Conduct at the Conestoga Manor and Lancaster have occasioned much\ \ our late Conduct at the Conestoga Manor and Lancaster have occasioned much Speculation\
\ Speculation & a great diversity of Sentiments in this and neighboring Governments;\ \ & a great diversity of Sentiments in this and neighboring Governments; some vindicating\
\ some vindicating & others condemning it; some charitably alleviating the Crime,\ \ & others condemning it; some charitably alleviating the Crime, & others maliciously\
\ & others maliciously painting it in the most odious & detestable Colours, we think\ \ painting it in the most odious & detestable Colours, we think it our duty to lay\
\ it our duty to lay before the Publick, the whole Matter as it appeared, & still\ \ before the Publick, the whole Matter as it appeared, & still appears, to us. .\
\ appears, to us. . . .\n\"If these things are not sufficient to prove an unjustifiable\ \ . .\n\"If these things are not sufficient to prove an unjustifiable Attachment\
\ Attachment in the Quakers to the Indians Savages, a fixed Resolution to befriend\ \ in the Quakers to the Indians Savages, a fixed Resolution to befriend them & an\
\ them & an utter insensibility to human Distresses, let us consider a few more\ \ utter insensibility to human Distresses, let us consider a few more recent Facts.\
\ recent Facts. When we found the last Summer that we were likely to get no Assistance\ \ When we found the last Summer that we were likely to get no Assistance from the\
\ from the Government, some Volunteers went out at our own Expense, determined to\ \ Government, some Volunteers went out at our own Expense, determined to drive our\
\ drive our Enemies from our Borders; & when we came near to the great Island, we\ \ Enemies from our Borders; & when we came near to the great Island, we understood\
\ understood that a Number of their Warriors had gone out against our Frontiers.\ \ that a Number of their Warriors had gone out against our Frontiers. Upon this\
\ Upon this we returned and came up with them and fought with them at the Munfey\ \ we returned and came up with them and fought with them at the Munfey Hill where\
\ Hill where we lost some of our Men & killed some of their Warriors & thereby saved\ \ we lost some of our Men & killed some of their Warriors & thereby saved our Frontiers\
\ our Frontiers from this Story in another Expedition. But no sooner had we destroyed\ \ from this Story in another Expedition. But no sooner had we destroyed their Provisions\
\ their Provisions on the great Island, & ruined their trade with the good People\ \ on the great Island, & ruined their trade with the good People at Bethlehem, but\
\ at Bethlehem, but these very Indians, who were justly suspected of having murdered\ \ these very Indians, who were justly suspected of having murdered our Friends in\
\ our Friends in Northampton County, were by the Influence of some Quakers taken\ \ Northampton County, were by the Influence of some Quakers taken under the Protection\
\ under the Protection of the Government to screen them from the Resentments of\ \ of the Government to screen them from the Resentments of the Friends and Relations\
\ the Friends and Relations of the Murdered, & to support them thro the Winter.\"\ \ of the Murdered, & to support them thro the Winter.\"\n\"Apology of the Paxton\
\n\u2014\"Apology of the Paxton Boys\" (pamphlet), 1764 (Note: \"apology\" in this\ \ Boys\" (pamphlet), 1764 (Note: \"apology\" in this context should be read as an\
\ context should be read as an explanation, not an admission of guilt or regret.\n\ \ explanation, not an admission of guilt or regret.\nThe sentiments expressed in\
The sentiments expressed in the explanation above reflect which of the ongoing tensions\ \ the explanation above reflect which of the ongoing tensions during the colonial\
\ during the colonial period of American history?\n(A) Tensions between British\ \ period of American history?\n(A) Tensions between British policies and the aspirations\
\ policies and the aspirations of North American colonists. (B) Tensions between\ \ of North American colonists. (B) Tensions between American Indians allied with\
\ American Indians allied with the French and those allied with the British. (C)\ \ the French and those allied with the British. (C) Tensions between freed African\
\ Tensions between freed African Americans and white planters. (D) Tensions between\ \ Americans and white planters. (D) Tensions between backcountry settlers and elites\
\ backcountry settlers and elites within colonial America.\nA: Let's think step\ \ within colonial America.\nA: Let's think step by step. We refer to Wikipedia articles\
\ by step. We refer to Wikipedia articles on us history for help. After the French\ \ on us history for help. After the French and Indian War, the Scotch-Irish settlers\
\ and Indian War, the Scotch-Irish settlers attacked American Indians. After the\ \ attacked American Indians. After the attacks on the Conestoga, about 250 Paxton\
\ attacks on the Conestoga, about 250 Paxton Boys present their grievances to the\ \ Boys present their grievances to the Pennsylvania legislature. As mentioned in\
\ Pennsylvania legislature. As mentioned in the information, the Paxton Boys cited\ \ the information, the Paxton Boys cited resentiment at local elites. The answer\
\ resentiment at local elites. The answer is (D).\n\nQ: This question refers to\ \ is (D).\n\nQ: This question refers to the following information.\nOur leaders\
\ the following information.\nOur leaders talk about stopping aggression from the\ \ talk about stopping aggression from the north, but this was a struggle among groups\
\ north, but this was a struggle among groups of Vietnamese until we intervened.\ \ of Vietnamese until we intervened. We seem bent upon saving the Vietnamese from\
\ We seem bent upon saving the Vietnamese from Ho Chi Minh even if we have to kill\ \ Ho Chi Minh even if we have to kill them and demolish their country to do it.\
\ them and demolish their country to do it. As the native people survey bombed-out\ \ As the native people survey bombed-out villages, women and children burned by\
\ villages, women and children burned by napalm, rice crops destroyed and cities\ \ napalm, rice crops destroyed and cities overrun with our military personnel, they\
\ overrun with our military personnel, they are doubtless saying secretly of the\ \ are doubtless saying secretly of the Vietcong guerillas and of the American forces,\
\ Vietcong guerillas and of the American forces, \"A plague on both your houses.\"\ \ \"A plague on both your houses.\" Stop the bombing, north and south, end search\
\ \u2026 Stop the bombing, north and south, end search and destroy offensive sweeps,\ \ and destroy offensive sweeps, and confine our military action to holding operations\
\ and confine our military action to holding operations on the ground. Bombing the\ \ on the ground. Bombing the north has failed to halt or seriously check the flow\
\ north has failed to halt or seriously check the flow of troops to the south and\ \ of troops to the south and may, in fact, have prompted a much greater war effort\
\ may, in fact, have prompted a much greater war effort by Hanoi.\n\u2014Senator\ \ by Hanoi.\n—Senator George McGovern, \"The Lessons of Vietnam,\" April 25, 1967\n\
\ George McGovern, \"The Lessons of Vietnam,\" April 25, 1967\nWhich of the following\ Which of the following opinions from the 1960s most directly reflects the perspective\
\ opinions from the 1960s most directly reflects the perspective of George McGovern's\ \ of George McGovern's speech?\n(A) Americans must maximize their technological\
\ speech?\n(A) Americans must maximize their technological edge in Vietnam. (B)\ \ edge in Vietnam. (B) American bombing in Vietnam is step by step leading to progress\
\ American bombing in Vietnam is step by step leading to progress in the war. (C)\ \ in the war. (C) American bombing in Vietnam is a failure. (D) America must not\
\ American bombing in Vietnam is a failure. (D) America must not give in to defeatism\ \ give in to defeatism about the war in Vietnam.\nA: Let's think step by step. We\
\ about the war in Vietnam.\nA: Let's think step by step. We refer to Wikipedia\ \ refer to Wikipedia articles on us history for help. \"Stop the bombing\" and \"\
\ articles on us history for help. \"Stop the bombing\" and \"Bombing the north\ Bombing the north has failed to halt or seriously check the flow of troops to the\
\ has failed to halt or seriously check the flow of troops to the south\" indicate\ \ south\" indicate that the perspective of George McGovern's speech is that Amerian\
\ that the perspective of George McGovern's speech is that Amerian bombing in Vietnam\ \ bombing in Vietnam is a failure. The answer is (C).\n\nQ: This question refers\
\ is a failure. The answer is (C).\n\nQ: This question refers to the following information.\n\ \ to the following information.\n\"In the new Code of Laws which I suppose it will\
\"In the new Code of Laws which I suppose it will be necessary for you to make I\ \ be necessary for you to make I desire you would Remember the Ladies, and be more\
\ desire you would Remember the Ladies, and be more generous and favorable to them\ \ generous and favorable to them than your ancestors. Do not put such unlimited\
\ than your ancestors. Do not put such unlimited power into the hands of the Husbands.\ \ power into the hands of the Husbands. Remember all Men would be tyrants if they\
\ Remember all Men would be tyrants if they could. If particular care and attention\ \ could. If particular care and attention is not paid to the Ladies we are determined\
\ is not paid to the Ladies we are determined to foment a Rebellion, and will not\ \ to foment a Rebellion, and will not hold ourselves bound by any Laws in which\
\ hold ourselves bound by any Laws in which we have no voice, or Representation.\"\ \ we have no voice, or Representation.\"\nAbigail Adams, in a letter to John Adams,\
\nAbigail Adams, in a letter to John Adams, 1776\n\"Special legislation for woman\ \ 1776\n\"Special legislation for woman has placed us in a most anomalous position.\
\ has placed us in a most anomalous position. Women invested with the rights of\ \ Women invested with the rights of citizens in one section—voters, jurors, office-holders—crossing\
\ citizens in one section\u2014voters, jurors, office-holders\u2014crossing an imaginary\ \ an imaginary line, are subjects in the next. In some States, a married woman may\
\ line, are subjects in the next. In some States, a married woman may hold property\ \ hold property and transact business in her own name; in others, her earnings belong\
\ and transact business in her own name; in others, her earnings belong to her husband.\ \ to her husband. In some States, a woman may testify against her husband, sue and\
\ In some States, a woman may testify against her husband, sue and be sued in the\ \ be sued in the courts; in others, she has no redress in case of damage to person,\
\ courts; in others, she has no redress in case of damage to person, property, or\ \ property, or character. In case of divorce on account of adultery in the husband,\
\ character. In case of divorce on account of adultery in the husband, the innocent\ \ the innocent wife is held to possess no right to children or property, unless\
\ wife is held to possess no right to children or property, unless by special decree\ \ by special decree of the court. But in no State of the Union has the wife the\
\ of the court. But in no State of the Union has the wife the right to her own person,\ \ right to her own person, or to any part of the joint earnings of the co-partnership\
\ or to any part of the joint earnings of the co-partnership during the life of\ \ during the life of her husband. In some States women may enter the law schools\
\ her husband. In some States women may enter the law schools and practice in the\ \ and practice in the courts; in others they are forbidden. In some universities\
\ courts; in others they are forbidden. In some universities girls enjoy equal educational\ \ girls enjoy equal educational advantages with boys, while many of the proudest\
\ advantages with boys, while many of the proudest institutions in the land deny\ \ institutions in the land deny them admittance, though the sons of China, Japan\
\ them admittance, though the sons of China, Japan and Africa are welcomed there.\ \ and Africa are welcomed there. But the privileges already granted in the several\
\ But the privileges already granted in the several States are by no means secure.\"\ \ States are by no means secure.\"\nSusan B. Anthony, \"Declaration of Rights for\
\nSusan B. Anthony, \"Declaration of Rights for Women,\" July 4, 1876\nThe sentiments\ \ Women,\" July 4, 1876\nThe sentiments expressed in the second excerpt by Susan\
\ expressed in the second excerpt by Susan B. Anthony are most likely in support\ \ B. Anthony are most likely in support of\n(A) the Equal Rights Amendment (B) universal\
\ of\n(A) the Equal Rights Amendment (B) universal suffrage (C) states' rights (D)\ \ suffrage (C) states' rights (D) prohibition\nA: Let's think step by step. We refer\
\ prohibition\nA: Let's think step by step. We refer to Wikipedia articles on us\ \ to Wikipedia articles on us history for help. The above information mentioned\
\ history for help. The above information mentioned that women are in an anomalous\ \ that women are in an anomalous position in terms of legislation. Women's earnings\
\ position in terms of legislation. Women's earnings do not belong to themselves,\ \ do not belong to themselves, or they cannot testify against her husbands. Susan\
\ or they cannot testify against her husbands. Susan believes women should have\ \ believes women should have equal legal rights as men. The answer is (B).\n\nQ:\
\ equal legal rights as men. The answer is (B).\n\nQ: This question refers to the\ \ This question refers to the following information.\n\"Society in every state is\
\ following information.\n\"Society in every state is a blessing, but government\ \ a blessing, but government even in its best state is but a necessary evil; in\
\ even in its best state is but a necessary evil; in its worst state an intolerable\ \ its worst state an intolerable one; for when we suffer, or are exposed to the\
\ one; for when we suffer, or are exposed to the same miseries by a government,\ \ same miseries by a government, which we might expect in a country without government,\
\ which we might expect in a country without government, our calamity is heightened\ \ our calamity is heightened by reflecting that we furnish the means by which we\
\ by reflecting that we furnish the means by which we suffer. Government, like dress,\ \ suffer. Government, like dress, is the badge of lost innocence; the palaces of\
\ is the badge of lost innocence; the palaces of kings are built on the ruins of\ \ kings are built on the ruins of the bowers of paradise. For were the impulses\
\ the bowers of paradise. For were the impulses of conscience clear, uniform, and\ \ of conscience clear, uniform, and irresistibly obeyed, man would need no other\
\ irresistibly obeyed, man would need no other lawgiver; but that not being the\ \ lawgiver; but that not being the case, he finds it necessary to surrender up a\
\ case, he finds it necessary to surrender up a part of his property to furnish\ \ part of his property to furnish means for the protection of the rest; and this\
\ means for the protection of the rest; and this he is induced to do by the same\ \ he is induced to do by the same prudence which in every other case advises him\
\ prudence which in every other case advises him out of two evils to choose the\ \ out of two evils to choose the least. Wherefore, security being the true design\
\ least. Wherefore, security being the true design and end of government, it unanswerably\ \ and end of government, it unanswerably follows that whatever form thereof appears\
\ follows that whatever form thereof appears most likely to ensure it to us, with\ \ most likely to ensure it to us, with the least expense and greatest benefit, is\
\ the least expense and greatest benefit, is preferable to all others.\"\nThomas\ \ preferable to all others.\"\nThomas Paine, Common Sense, 1776\nWhich of the following\
\ Paine, Common Sense, 1776\nWhich of the following \"miseries\" alluded to above\ \ \"miseries\" alluded to above were most condemned by Anti-Federalists of the post-Revolutionary\
\ were most condemned by Anti-Federalists of the post-Revolutionary era?\n(A) Organized\ \ era?\n(A) Organized response to Bacon's Rebellion (B) Federal response to Shays's\
\ response to Bacon's Rebellion (B) Federal response to Shays's Rebellion (C) Federal\ \ Rebellion (C) Federal response to the Whiskey Rebellion (D) Federal response to\
\ response to the Whiskey Rebellion (D) Federal response to Pontiac's Rebellion\n\ \ Pontiac's Rebellion\nA: Let's think step by step. We refer to Wikipedia articles\
A: Let's think step by step. We refer to Wikipedia articles on us history for help.\ \ on us history for help. Anti-Federalists do not believe centralized government\
\ Anti-Federalists do not believe centralized government power, and suspect Washington's\ \ power, and suspect Washington's military response to Whiskey Rebellion. Bacon's\
\ military response to Whiskey Rebellion. Bacon's Rebellion and Pontiac's Rebellion\ \ Rebellion and Pontiac's Rebellion happen before the Revolution and they can be\
\ happen before the Revolution and they can be ruled out. The answer is (C)." \ ruled out. The answer is (C)."
include: _mmlu_flan_cot_fewshot_template_yaml "group": "mmlu_flan_cot_fewshot_humanities"
task: mmlu_flan_cot_fewshot_high_school_us_history "include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_high_school_us_history"
dataset_name: high_school_world_history "dataset_name": "high_school_world_history"
description: "The following are multiple choice questions (with answers) about high\ "description": "The following are multiple choice questions (with answers) about high\
\ school world history.\n\nQ: This question refers to the following information.\n\ \ school world history.\n\nQ: This question refers to the following information.\n\
\"At least one of the [world's] societies would have to somehow enormously increase\ \"At least one of the [world's] societies would have to somehow enormously increase\
\ its productivity [in order to achieve global hegemony]. That quantum jump would\ \ its productivity [in order to achieve global hegemony]. That quantum jump would\
...@@ -8,75 +8,75 @@ description: "The following are multiple choice questions (with answers) about h ...@@ -8,75 +8,75 @@ description: "The following are multiple choice questions (with answers) about h
\ be accomplished by exploiting the ecosystems, mineral resources, and human assets\ \ be accomplished by exploiting the ecosystems, mineral resources, and human assets\
\ of whole continents outside the lands of the society making the jump. Western\ \ of whole continents outside the lands of the society making the jump. Western\
\ Europe did just that by means of its brutality and guns and, more important, by\ \ Europe did just that by means of its brutality and guns and, more important, by\
\ geographical and ecological luck.\"\nCopyright \xA9 2015 Cambridge University\ \ geographical and ecological luck.\"\nCopyright © 2015 Cambridge University Press.\n\
\ Press.\nAlfred Crosby, historian, Ecological Imperialism, 2004\nThe \"quantum\ Alfred Crosby, historian, Ecological Imperialism, 2004\nThe \"quantum jump\" mentioned\
\ jump\" mentioned in the passage most directly contributed to which of the following\ \ in the passage most directly contributed to which of the following developments\
\ developments in the period 1450\u20131750 C.E.?\n(A) A breakdown in trade routes\ \ in the period 1450–1750 C.E.?\n(A) A breakdown in trade routes through the collapse\
\ through the collapse of the established state structure (B) An increase in the\ \ of the established state structure (B) An increase in the population of the world\
\ population of the world through more plentiful supplies of food (C) The spread\ \ through more plentiful supplies of food (C) The spread of Chinese and Indian belief\
\ of Chinese and Indian belief systems across the world (D) An increase in social\ \ systems across the world (D) An increase in social unrest\nA: Let's think step\
\ unrest\nA: Let's think step by step. We refer to Wikipedia articles on world history\ \ by step. We refer to Wikipedia articles on world history for help. The \"quantum\
\ for help. The \"quantum jump\" mentioned in the passage refers to the conquest\ \ jump\" mentioned in the passage refers to the conquest of the New World and the\
\ of the New World and the Columbian Exchange. Choice (A) and (C) did not happen\ \ Columbian Exchange. Choice (A) and (C) did not happen in history. Choice (C) refers\
\ in history. Choice (C) refers to the human assets. The answer is (B).\n\nQ: This\ \ to the human assets. The answer is (B).\n\nQ: This question refers to the following\
\ question refers to the following information.\n\"The struggle against neo-colonialism\ \ information.\n\"The struggle against neo-colonialism is not aimed at excluding\
\ is not aimed at excluding the capital of the developed world from operating in\ \ the capital of the developed world from operating in less developed countries.\
\ less developed countries. It is aimed at preventing the financial power of the\ \ It is aimed at preventing the financial power of the developed countries being\
\ developed countries being used in such a way as to impoverish the less developed.\n\ \ used in such a way as to impoverish the less developed.\nNon-alignment, as practiced\
Non-alignment, as practiced by Ghana and many other countries, is based on co-operation\ \ by Ghana and many other countries, is based on co-operation with all States whether\
\ with all States whether they be capitalist, socialist or have a mixed economy.\ \ they be capitalist, socialist or have a mixed economy. Such a policy, therefore,\
\ Such a policy, therefore, involves foreign investment from capitalist countries,\ \ involves foreign investment from capitalist countries, but it must be invested\
\ but it must be invested in accordance with a national plan drawn up by the government\ \ in accordance with a national plan drawn up by the government of the non-aligned\
\ of the non-aligned State with its own interests in mind. The issue is not what\ \ State with its own interests in mind. The issue is not what return the foreign\
\ return the foreign investor receives on his investments\u2026The question is one\ \ investor receives on his investments…The question is one of power. A State in\
\ of power. A State in the grip of neo-colonialism is not master of its own destiny.\"\ \ the grip of neo-colonialism is not master of its own destiny.\"\nKwame Nkrumah,\
\nKwame Nkrumah, Neo-Colonialism, 1965\nWhich of the following provides the best\ \ Neo-Colonialism, 1965\nWhich of the following provides the best context for Nkrumah's\
\ context for Nkrumah's writings?\n(A) The Industrial Revolution (B) Decolonization\ \ writings?\n(A) The Industrial Revolution (B) Decolonization (C) Regional Free\
\ (C) Regional Free Trade Associations (D) Autarky\nA: Let's think step by step.\ \ Trade Associations (D) Autarky\nA: Let's think step by step. We refer to Wikipedia\
\ We refer to Wikipedia articles on world history for help. The passage expresses\ \ articles on world history for help. The passage expresses a point that the successful\
\ a point that the successful fight against neo-colonialism were in danger and the\ \ fight against neo-colonialism were in danger and the newly independent nations\
\ newly independent nations like Ghana may be re-colonized via financial power of\ \ like Ghana may be re-colonized via financial power of the developed countries.\
\ the developed countries. The answer is (B).\n\nQ: This question refers to the\ \ The answer is (B).\n\nQ: This question refers to the following information.\n\"\
\ following information.\n\"Indeed, as both the fatwas of distinguished [scholars]\ Indeed, as both the fatwas of distinguished [scholars] who base their opinion on\
\ who base their opinion on reason and tradition alike and the consensus of the\ \ reason and tradition alike and the consensus of the Sunni community agree that\
\ Sunni community agree that the ancient obligation of extirpation, extermination,\ \ the ancient obligation of extirpation, extermination, and expulsion of evil innovation\
\ and expulsion of evil innovation must be the aim of our exalted aspiration, for\ \ must be the aim of our exalted aspiration, for \"Religious zeal is a victory for\
\ \"Religious zeal is a victory for the Faith of God the Beneficent\"; then, in\ \ the Faith of God the Beneficent\"; then, in accordance with the words of the Prophet\
\ accordance with the words of the Prophet (Peace upon him!) \"Whosoever introduces\ \ (Peace upon him!) \"Whosoever introduces evil innovation into our order must be\
\ evil innovation into our order must be expelled\" and \"Whosoever does aught against\ \ expelled\" and \"Whosoever does aught against our order must be expelled,\" action\
\ our order must be expelled,\" action has become necessary and exigent\u2026\"\n\ \ has become necessary and exigent…\"\nLetter from Ottoman Sultan Selim I to Safavid\
Letter from Ottoman Sultan Selim I to Safavid Shah Ismail I, 1514\nThe letter from\ \ Shah Ismail I, 1514\nThe letter from Selim I is most clearly an example of which\
\ Selim I is most clearly an example of which of the following?\n(A) The maintenance\ \ of the following?\n(A) The maintenance of military supremacy at all costs (B)\
\ of military supremacy at all costs (B) Expanding tensions between religious sects\ \ Expanding tensions between religious sects (C) Factors that brought about the\
\ (C) Factors that brought about the collapse of the Ottoman Empire (D) Peacemaking\ \ collapse of the Ottoman Empire (D) Peacemaking efforts among the Islamic empires\n\
\ efforts among the Islamic empires\nA: Let's think step by step. We refer to Wikipedia\ A: Let's think step by step. We refer to Wikipedia articles on world history for\
\ articles on world history for help. The passage is an example of expanding tensions\ \ help. The passage is an example of expanding tensions between Selim and Ismail.\
\ between Selim and Ismail. In the passage the Selim references the fatwa and the\ \ In the passage the Selim references the fatwa and the consensus of the Sunni community\
\ consensus of the Sunni community to against whosoever introduces evil. The answer\ \ to against whosoever introduces evil. The answer is (B).\n\nQ: This question refers\
\ is (B).\n\nQ: This question refers to the following information.\n\"The real grievance\ \ to the following information.\n\"The real grievance of the worker is the insecurity\
\ of the worker is the insecurity of his existence; he is not sure that he will\ \ of his existence; he is not sure that he will always have work, he is not sure\
\ always have work, he is not sure that he will always be healthy, and he foresees\ \ that he will always be healthy, and he foresees that he will one day be old and\
\ that he will one day be old and unfit to work. If he falls into poverty, even\ \ unfit to work. If he falls into poverty, even if only through a prolonged illness,\
\ if only through a prolonged illness, he is then completely helpless, exam_ins\ \ he is then completely helpless, exam_ins to his own devices, and society does\
\ to his own devices, and society does not currently recognize any real obligation\ \ not currently recognize any real obligation towards him beyond the usual help\
\ towards him beyond the usual help for the poor, even if he has been working all\ \ for the poor, even if he has been working all the time ever so faithfully and\
\ the time ever so faithfully and diligently. The usual help for the poor, however,\ \ diligently. The usual help for the poor, however, leaves a lot to be desired,\
\ leaves a lot to be desired, especially in large cities, where it is very much\ \ especially in large cities, where it is very much worse than in the country.\"\
\ worse than in the country.\"\nOtto von Bismarck, 1884\nOtto von Bismarck likely\ \nOtto von Bismarck, 1884\nOtto von Bismarck likely made this speech in reaction\
\ made this speech in reaction to which of the following issues?\n(A) Social acceptance\ \ to which of the following issues?\n(A) Social acceptance of child labor (B) Declining\
\ of child labor (B) Declining life expectancy in Germany (C) Criticisms of German\ \ life expectancy in Germany (C) Criticisms of German trade tariffs (D) Negative\
\ trade tariffs (D) Negative effects attributed to industrial capitalism\nA: Let's\ \ effects attributed to industrial capitalism\nA: Let's think step by step. We refer\
\ think step by step. We refer to Wikipedia articles on world history for help.\ \ to Wikipedia articles on world history for help. The passage talks about the grievance\
\ The passage talks about the grievance of the work under the industrial capitalism.\ \ of the work under the industrial capitalism. The answer is (D).\n\nQ: This question\
\ The answer is (D).\n\nQ: This question refers to the following information.\n\ \ refers to the following information.\nHe contains all works and desires and all\
He contains all works and desires and all perfumes and all tastes. He enfolds the\ \ perfumes and all tastes. He enfolds the whole universe and in silence is loving\
\ whole universe and in silence is loving to all. This is the Spirit that is in\ \ to all. This is the Spirit that is in my heart, this is Brahman. To him I shall\
\ my heart, this is Brahman. To him I shall come when I go beyond this life, and\ \ come when I go beyond this life, and to him will come he who has faith and doubts\
\ to him will come he who has faith and doubts not.\n\u2014The Upanishads, India,\ \ not.\n—The Upanishads, India, c. 1000 BCE\nTo which religion does the speaker\
\ c. 1000 BCE\nTo which religion does the speaker most likely belong?\n(A) Hinduism\ \ most likely belong?\n(A) Hinduism (B) Buddhism (C) Shintoism (D) Zoroastrianism\n\
\ (B) Buddhism (C) Shintoism (D) Zoroastrianism\nA: Let's think step by step. We\ A: Let's think step by step. We refer to Wikipedia articles on world history for\
\ refer to Wikipedia articles on world history for help. Brahman refers to the ultimate\ \ help. Brahman refers to the ultimate reality of all things in the Hindu religion.\
\ reality of all things in the Hindu religion. In contrast, Buddhism does not have\ \ In contrast, Buddhism does not have a concept of supreme God. The answer is (A)."
\ a concept of supreme God. The answer is (A)." "group": "mmlu_flan_cot_fewshot_humanities"
include: _mmlu_flan_cot_fewshot_template_yaml "include": "_mmlu_flan_cot_fewshot_template_yaml"
task: mmlu_flan_cot_fewshot_high_school_world_history "task": "mmlu_flan_cot_fewshot_high_school_world_history"
dataset_name: human_aging "dataset_name": "human_aging"
description: 'The following are multiple choice questions (with answers) about human "description": "The following are multiple choice questions (with answers) about human\
aging. \ aging.\n\nQ: All other things being equal, which of the following persons is more\
\ likely to show osteoporosis?\n(A) An older Hispanic American woman (B) An older\
\ African American woman (C) An older Asian American woman (D) An older Native American\
Q: All other things being equal, which of the following persons is more likely to \ woman\nA: Let's think step by step. We refer to Wikipedia articles on human aging\
show osteoporosis? \ for help. Although osteoporosis can occur at any age, the risk is higher for older\
\ people. It is most common in Asian and non-Hispanic white women. The answer is\
(A) An older Hispanic American woman (B) An older African American woman (C) An \ (C).\n\nQ: The finding that adults tend to remember events from their adolescence\
older Asian American woman (D) An older Native American woman \ better than from other periods in their lives is referred to as the\n(A) Adolescence\
\ advantage (B) Reminiscence bump (C) Memorial memorial (D) Quadratic retrieval\
A: Let''s think step by step. We refer to Wikipedia articles on human aging for \ spike\nA: Let's think step by step. We refer to Wikipedia articles on human aging\
help. Although osteoporosis can occur at any age, the risk is higher for older people. \ for help. Reminiscence bump is a phenomenon that older adults tend to recollect\
It is most common in Asian and non-Hispanic white women. The answer is (C). \ events during their young ages. People usually have a period of childhood amnesia\
\ from birth to around age 5, and a reminiscence bump between 10 and 30. The answer\
\ is (B).\n\nQ: Which element in tobacco smoke is responsible for cancers?\n(A)\
Q: The finding that adults tend to remember events from their adolescence better \ Nicotine (B) Tar (C) Carbon monoxide (D) Smoke particles\nA: Let's think step\
than from other periods in their lives is referred to as the \ by step. We refer to Wikipedia articles on human aging for help. The benzene,\
\ acrylamide and acrylonitrile in tar interact with the lungs and cause DNA mutations\
(A) Adolescence advantage (B) Reminiscence bump (C) Memorial memorial (D) Quadratic \ in cells of the lungs, and lead to cancer. The answer is (B).\n\nQ: When older\
retrieval spike \ adults move to a new state after retirement, which of the following is the more\
\ likely destination?\n(A) Texas (B) California (C) Hawaii (D) Vermont\nA: Let's\
A: Let''s think step by step. We refer to Wikipedia articles on human aging for \ think step by step. We refer to Wikipedia articles on human aging for help. Texas\
help. Reminiscence bump is a phenomenon that older adults tend to recollect events \ does not have state tax, and has low cost of living compared with the other three\
during their young ages. People usually have a period of childhood amnesia from \ options. The answer is (A)."
birth to around age 5, and a reminiscence bump between 10 and 30. The answer is "group": "mmlu_flan_cot_fewshot_other"
(B). "include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_human_aging"
Q: Which element in tobacco smoke is responsible for cancers?
(A) Nicotine (B) Tar (C) Carbon monoxide (D) Smoke particles
A: Let''s think step by step. We refer to Wikipedia articles on human aging for
help. The benzene, acrylamide and acrylonitrile in tar interact with the lungs and
cause DNA mutations in cells of the lungs, and lead to cancer. The answer is (B).
Q: When older adults move to a new state after retirement, which of the following
is the more likely destination?
(A) Texas (B) California (C) Hawaii (D) Vermont
A: Let''s think step by step. We refer to Wikipedia articles on human aging for
help. Texas does not have state tax, and has low cost of living compared with the
other three options. The answer is (A).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_human_aging
dataset_name: human_sexuality "dataset_name": "human_sexuality"
description: 'The following are multiple choice questions (with answers) about human "description": "The following are multiple choice questions (with answers) about human\
sexuality. \ sexuality.\n\nQ: The most common disorder among men who seek sexual therapy is:\n\
(A) premature ejaculation (B) inhibited ejaculation (C) erectile disorder (D) ejaculatory\
\ disorder\nA: Let's think step by step. We refer to Wikipedia articles on human\
Q: The most common disorder among men who seek sexual therapy is: \ sexuality for help. The most common type of arousal disorder for men is erectile\
\ dysfunction, meaning that a person is not able to get any physical satisfaction\
(A) premature ejaculation (B) inhibited ejaculation (C) erectile disorder (D) ejaculatory \ from sexual activity although he may be interested in it. The answer is (C).\n\
disorder \nQ: A woman who knows she has active herpes and untreated syphilis but continues\
\ to have sex without informing her partners of her condition has, in psychoanalytic\
A: Let''s think step by step. We refer to Wikipedia articles on human sexuality \ terms:\n(A) a strong ego (B) a weak superego (C) a weak id (D) a strong superego\n\
for help. The most common type of arousal disorder for men is erectile dysfunction, A: Let's think step by step. We refer to Wikipedia articles on human sexuality for\
meaning that a person is not able to get any physical satisfaction from sexual activity \ help. A person with weak superego tends to be delinquent, criminal or have antisocial\
although he may be interested in it. The answer is (C). \ personality. The action of the woman who knows she has active venereal disease\
\ but still have sex with her partners indicate she may has antisocial personality.\
\ The answer is (B).\n\nQ: The nature of homosexual activities that occur during\
Q: A woman who knows she has active herpes and untreated syphilis but continues \ preadolescence include all but which of the following?\n(A) sexual intercourse\
to have sex without informing her partners of her condition has, in psychoanalytic \ (B) circle jerks (C) exhibitionism (D) touching each other's genitals\nA: Let's\
terms: \ think step by step. We refer to Wikipedia articles on human sexuality for help.\
\ Sexual intercourse is prohibited during preadolescence. The answer is (A).\n\n\
(A) a strong ego (B) a weak superego (C) a weak id (D) a strong superego Q: Women's ability to have multiple orgasms is primarily due to:\n(A) the fact that\
\ they do not have a refractory period. (B) the response of the inner layers of\
A: Let''s think step by step. We refer to Wikipedia articles on human sexuality \ the vagina. (C) having alternating orgasms in different locations. (D) the G-Spot.\n\
for help. A person with weak superego tends to be delinquent, criminal or have antisocial A: Let's think step by step. We refer to Wikipedia articles on human sexuality for\
personality. The action of the woman who knows she has active venereal disease but \ help. The refractory period is the time when a person is not able to be erect\
still have sex with her partners indicate she may has antisocial personality. The \ or is not interested in sex. The answer is (A).\n\nQ: Morning sickness is typically\
answer is (B). \ a problem:\n(A) during the first trimester (B) during the second trimester (C)\
\ during the third trimester (D) all through the pregnancy\nA: Let's think step\
\ by step. We refer to Wikipedia articles on human sexuality for help. Morning sickness\
Q: The nature of homosexual activities that occur during preadolescence include \ usually begins by nine weeks after conception, corresponding to the first trimester.\
all but which of the following? \ The answer is (A)."
"group": "mmlu_flan_cot_fewshot_social_sciences"
(A) sexual intercourse (B) circle jerks (C) exhibitionism (D) touching each other''s "include": "_mmlu_flan_cot_fewshot_template_yaml"
genitals "task": "mmlu_flan_cot_fewshot_human_sexuality"
A: Let''s think step by step. We refer to Wikipedia articles on human sexuality
for help. Sexual intercourse is prohibited during preadolescence. The answer is
(A).
Q: Women''s ability to have multiple orgasms is primarily due to:
(A) the fact that they do not have a refractory period. (B) the response of the
inner layers of the vagina. (C) having alternating orgasms in different locations.
(D) the G-Spot.
A: Let''s think step by step. We refer to Wikipedia articles on human sexuality
for help. The refractory period is the time when a person is not able to be erect
or is not interested in sex. The answer is (A).
Q: Morning sickness is typically a problem:
(A) during the first trimester (B) during the second trimester (C) during the third
trimester (D) all through the pregnancy
A: Let''s think step by step. We refer to Wikipedia articles on human sexuality
for help. Morning sickness usually begins by nine weeks after conception, corresponding
to the first trimester. The answer is (A).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_human_sexuality
dataset_name: international_law "dataset_name": "international_law"
description: 'The following are multiple choice questions (with answers) about international "description": "The following are multiple choice questions (with answers) about international\
law. \ law.\n\nQ: How the consent to be bound of a State may be expressed?\n(A) The consent\
\ of a State to be bound is expressed only by ratification (B) The consent of a\
\ state to be bound by a treaty may be expressed by signature, ratification, acceptance,\
Q: How the consent to be bound of a State may be expressed? \ approval or accession (C) The consent of a State to be bound is expressed by signature\
\ (D) The consent of a State to be bound is expressed by whatever means they choose\n\
(A) The consent of a State to be bound is expressed only by ratification (B) The A: Let's think step by step. We refer to Wikipedia articles on international law\
consent of a state to be bound by a treaty may be expressed by signature, ratification, \ for help. Article 11 of Vienna Convention on the Law of Treaties signed in 1969\
acceptance, approval or accession (C) The consent of a State to be bound is expressed \ states that \"the consent of a State to be bound by a treaty may be expressed\
by signature (D) The consent of a State to be bound is expressed by whatever means \ by signature, exchange of instruments constituting a treaty, ratification, acceptance,\
they choose \ approval or accession, or by any other means if so agreed.\" (B) is the most precise\
\ and accurate answer. The answer is (B).\n\nQ: What is the judge ad hoc?\n(A) If\
A: Let''s think step by step. We refer to Wikipedia articles on international law \ a party to a contentious case before the ICJ does not have a national sitting\
for help. Article 11 of Vienna Convention on the Law of Treaties signed in 1969 \ as judge, it is entitled to nominate someone as a judge solely for that case,\
states that "the consent of a State to be bound by a treaty may be expressed by \ with the title of judge ad hoc (B) Judge ad hoc is the member of the bench of\
signature, exchange of instruments constituting a treaty, ratification, acceptance, \ the ICJ with a casting vote (C) Judge ad hoc is a surrogate judge, in case a judge\
approval or accession, or by any other means if so agreed." (B) is the most precise \ is disqualified or passes away (D) Judge ad hoc is the judge that each party will\
and accurate answer. The answer is (B). \ always nominate in every contentious case\nA: Let's think step by step. We refer\
\ to Wikipedia articles on international law for help. As \"ad hoc\" implies, a\
\ judge ad hoc is appointed only for a specific case or period, when a party to\
Q: What is the judge ad hoc? \ a contentious case before the International Court of Justice does not have a regular\
\ national sitting as judge. The answer is (A).\n\nQ: When 'consent' can serve as\
(A) If a party to a contentious case before the ICJ does not have a national sitting \ a circumstance precluding the wrongfulness of a State conduct?\n(A) Consent can\
as judge, it is entitled to nominate someone as a judge solely for that case, with \ serve as a circumstance precluding the wrongfulness whenever it is given (B) Consent\
the title of judge ad hoc (B) Judge ad hoc is the member of the bench of the ICJ \ can never serve as a circumstance precluding wrongfulness (C) Consent can serve\
with a casting vote (C) Judge ad hoc is a surrogate judge, in case a judge is disqualified \ as a circumstance precluding wrongfulness, provided the consent is valid and to\
or passes away (D) Judge ad hoc is the judge that each party will always nominate \ the extent that the conduct remains within the limits of the consent given (D)\
in every contentious case \ Consent can always serve as a circumstance precluding wrongfulness, no matter\
\ which organ of the State gives it\nA: Let's think step by step. We refer to Wikipedia\
A: Let''s think step by step. We refer to Wikipedia articles on international law \ articles on international law for help. Valid consent can serve as a circumstance\
for help. As "ad hoc" implies, a judge ad hoc is appointed only for a specific case \ precluding the wrongfulness of a State conduct if the conduct remains within the\
or period, when a party to a contentious case before the International Court of \ limits of that consent, according to Chapter V of the Responsibility of States\
Justice does not have a regular national sitting as judge. The answer is (A). \ for Internationally Wrongful Acts, 2001, United Nations. The answer is (C).\n\n\
Q: Would a reservation to the definition of torture in the ICCPR be acceptable in\
\ contemporary practice?\n(A) This is an acceptable reservation if the reserving\
Q: When ''consent'' can serve as a circumstance precluding the wrongfulness of a \ country's legislation employs a different definition (B) This is an unacceptable\
State conduct? \ reservation because it contravenes the object and purpose of the ICCPR (C) This\
\ is an unacceptable reservation because the definition of torture in the ICCPR\
(A) Consent can serve as a circumstance precluding the wrongfulness whenever it \ is consistent with customary international law (D) This is an acceptable reservation\
is given (B) Consent can never serve as a circumstance precluding wrongfulness (C) \ because under general international law States have the right to enter reservations\
Consent can serve as a circumstance precluding wrongfulness, provided the consent \ to treaties\nA: Let's think step by step. We refer to Wikipedia articles on international\
is valid and to the extent that the conduct remains within the limits of the consent \ law for help. For it contravenes the object and purpose of the ICCPR, this is\
given (D) Consent can always serve as a circumstance precluding wrongfulness, no \ an unacceptable reservation in contemporary practice. The answer is (B).\n\nQ:\
matter which organ of the State gives it \ What types of force does Article 2(4) of the UN Charter prohibit?\n(A) Article\
\ 2(4) encompasses only armed force (B) Article 2(4) encompasses all types of force,\
A: Let''s think step by step. We refer to Wikipedia articles on international law \ including sanctions (C) Article 2(4) encompasses all interference in the domestic\
for help. Valid consent can serve as a circumstance precluding the wrongfulness \ affairs of States (D) Article 2(4) encompasses force directed only against a State's\
of a State conduct if the conduct remains within the limits of that consent, according \ territorial integrity\nA: Let's think step by step. We refer to Wikipedia articles\
to Chapter V of the Responsibility of States for Internationally Wrongful Acts, \ on international law for help. Article 2(4) of the UN Charter prohibits states\
2001, United Nations. The answer is (C). \ from using armed forces in their international relations. The answer is (A)."
"group": "mmlu_flan_cot_fewshot_humanities"
"include": "_mmlu_flan_cot_fewshot_template_yaml"
Q: Would a reservation to the definition of torture in the ICCPR be acceptable in "task": "mmlu_flan_cot_fewshot_international_law"
contemporary practice?
(A) This is an acceptable reservation if the reserving country''s legislation employs
a different definition (B) This is an unacceptable reservation because it contravenes
the object and purpose of the ICCPR (C) This is an unacceptable reservation because
the definition of torture in the ICCPR is consistent with customary international
law (D) This is an acceptable reservation because under general international law
States have the right to enter reservations to treaties
A: Let''s think step by step. We refer to Wikipedia articles on international law
for help. For it contravenes the object and purpose of the ICCPR, this is an unacceptable
reservation in contemporary practice. The answer is (B).
Q: What types of force does Article 2(4) of the UN Charter prohibit?
(A) Article 2(4) encompasses only armed force (B) Article 2(4) encompasses all types
of force, including sanctions (C) Article 2(4) encompasses all interference in the
domestic affairs of States (D) Article 2(4) encompasses force directed only against
a State''s territorial integrity
A: Let''s think step by step. We refer to Wikipedia articles on international law
for help. Article 2(4) of the UN Charter prohibits states from using armed forces
in their international relations. The answer is (A).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_international_law
dataset_name: jurisprudence "dataset_name": "jurisprudence"
description: 'The following are multiple choice questions (with answers) about jurisprudence. "description": "The following are multiple choice questions (with answers) about jurisprudence.\n\
\nQ: Iverson Jewelers wrote a letter to Miller, 'We have received an exceptionally\
\ fine self winding Rolox watch which we will sell to you at a very favorable price.'\n\
Q: Iverson Jewelers wrote a letter to Miller, ''We have received an exceptionally (A) The letter is an offer to sell (B) A valid offer cannot be made by letter. (C)\
fine self winding Rolox watch which we will sell to you at a very favorable price.'' \ The letter contains a valid offer which will terminate within a reasonable time.\
\ (D) The letter lacks one of the essential elements of an offer.\nA: Let's think\
(A) The letter is an offer to sell (B) A valid offer cannot be made by letter. (C) \ step by step. We refer to Wikipedia articles on jurisprudence for help. An offer\
The letter contains a valid offer which will terminate within a reasonable time. \ shows the intent to enter into a mutually-beneficial contract with specific terms.\
(D) The letter lacks one of the essential elements of an offer. \ An offer can be made by a letter. While this letter indicates the willingness\
\ to sell, the lack of specific terms, such as transaction price and offer expiration\
A: Let''s think step by step. We refer to Wikipedia articles on jurisprudence for \ date, makes it an incomplete offer. The answer is (D).\n\nQ: Functions of the\
help. An offer shows the intent to enter into a mutually-beneficial contract with \ law include all but which of the following?\n(A) maximizing individual freedom\
specific terms. An offer can be made by a letter. While this letter indicates the \ (B) providing a basis for compromise (C) keeping the peace (D) promoting the principles\
willingness to sell, the lack of specific terms, such as transaction price and offer \ of the free enterprise system\nA: Let's think step by step. We refer to Wikipedia\
expiration date, makes it an incomplete offer. The answer is (D). \ articles on jurisprudence for help. Laws are fundamentally about helping resolve\
\ disputes between individuals, and therefore essential for maximizing individual\
\ freedom, providing a basis for compromise, and keeping the peace. The answer is\
Q: Functions of the law include all but which of the following? \ (D).\n\nQ: The ________ School of jurisprudence postulates that the law is based\
\ on what is \"correct.\"\n(A) Natural Law (B) Analytical (C) Historical (D) Sociological\n\
(A) maximizing individual freedom (B) providing a basis for compromise (C) keeping A: Let's think step by step. We refer to Wikipedia articles on jurisprudence for\
the peace (D) promoting the principles of the free enterprise system \ help. Natural Law School of jurisprudence focuses on the laws of nature, and states\
\ that the law should be based on ethics, morals, and what is \"correct\". Analytical\
A: Let''s think step by step. We refer to Wikipedia articles on jurisprudence for \ deals with the law as it already exists, Historical postulates that the law was\
help. Laws are fundamentally about helping resolve disputes between individuals, \ found and not made, and Sociological studies how the law and society impact each\
and therefore essential for maximizing individual freedom, providing a basis for \ other. The answer is (A).\n\nQ: Which word best summarizes Weber's explanation\
compromise, and keeping the peace. The answer is (D). \ of the development of formally rational law?\n(A) Authority. (B) Charisma. (C)\
\ Co-operation. (D) Capitalism.\nA: Let's think step by step. We refer to Wikipedia\
\ articles on jurisprudence for help. Weber explained the development of formal\
Q: The ________ School of jurisprudence postulates that the law is based on what \ rationality in laws as how the modern society moved from tradition to rationality,\
is "correct." \ where people decide actions based less on how they were culturally done and more\
\ on expected utilities. How rational individuals optimize efficiency of accomplishing\
(A) Natural Law (B) Analytical (C) Historical (D) Sociological \ tasks for higher rewards is a core principle of Capitalism. The answer is (D).\n\
\nQ: Which position does Rawls claim is the least likely to be adopted by the POP\
A: Let''s think step by step. We refer to Wikipedia articles on jurisprudence for \ (people in the original position)?\n(A) The POP would choose equality above liberty.\
help. Natural Law School of jurisprudence focuses on the laws of nature, and states \ (B) The POP would opt for the 'maximin' strategy. (C) The POP would opt for the\
that the law should be based on ethics, morals, and what is "correct". Analytical \ 'difference principle'. (D) The POP would reject the 'system of natural liberty.'\n\
deals with the law as it already exists, Historical postulates that the law was A: Let's think step by step. We refer to Wikipedia articles on jurisprudence for\
found and not made, and Sociological studies how the law and society impact each \ help. The POP would opt for the 'maximin' strategy, opt for the 'difference principle',\
other. The answer is (A). \ and reject the 'system of natural liberty', but the POP would not choose equality\
\ above liberty, since the POP assume both equal and free citizens. The answer is\
\ (A)."
Q: Which word best summarizes Weber''s explanation of the development of formally "group": "mmlu_flan_cot_fewshot_humanities"
rational law? "include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_jurisprudence"
(A) Authority. (B) Charisma. (C) Co-operation. (D) Capitalism.
A: Let''s think step by step. We refer to Wikipedia articles on jurisprudence for
help. Weber explained the development of formal rationality in laws as how the modern
society moved from tradition to rationality, where people decide actions based less
on how they were culturally done and more on expected utilities. How rational individuals
optimize efficiency of accomplishing tasks for higher rewards is a core principle
of Capitalism. The answer is (D).
Q: Which position does Rawls claim is the least likely to be adopted by the POP
(people in the original position)?
(A) The POP would choose equality above liberty. (B) The POP would opt for the ''maximin''
strategy. (C) The POP would opt for the ''difference principle''. (D) The POP would
reject the ''system of natural liberty.''
A: Let''s think step by step. We refer to Wikipedia articles on jurisprudence for
help. The POP would opt for the ''maximin'' strategy, opt for the ''difference principle'',
and reject the ''system of natural liberty'', but the POP would not choose equality
above liberty, since the POP assume both equal and free citizens. The answer is
(A).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_jurisprudence
dataset_name: logical_fallacies "dataset_name": "logical_fallacies"
description: 'The following are multiple choice questions (with answers) about logical "description": "The following are multiple choice questions (with answers) about logical\
fallacies. \ fallacies.\n\nQ: When an arguer causes confusion during refutation because of\
\ real or feigned lack of an ability to engage in refutation, that arguer may have\
\ committed the fallacy of\n(A) poor sportsmanship (B) appeal to compassion (C)\
Q: When an arguer causes confusion during refutation because of real or feigned \ argument against the person (D) ignorance of refutation\nA: Let's think step by\
lack of an ability to engage in refutation, that arguer may have committed the fallacy \ step. We refer to Wikipedia articles on logical fallacies for help. Ignorance\
of \ of refutation, one of Aristotle's original list of logical fallacies in his Organon,\
\ is when someone causes confusion in an argument through real or feigned inability\
(A) poor sportsmanship (B) appeal to compassion (C) argument against the person \ to engage in refutation, in order to win the argument. The answer is (D).\n\n\
(D) ignorance of refutation Q: The complex question fallacy consists of\n(A) arguing something is inferior just\
\ because it doesn't do something it was never intended to do. (B) including more\
A: Let''s think step by step. We refer to Wikipedia articles on logical fallacies \ than one claim in the proposition and treating proof for one claim as proof for\
for help. Ignorance of refutation, one of Aristotle''s original list of logical \ all the claims. (C) drawing a conclusion before examining the evidence, and only\
fallacies in his Organon, is when someone causes confusion in an argument through \ considering evidence that supports that conclusion. (D) asking a question that\
real or feigned inability to engage in refutation, in order to win the argument. \ includes either an unproven assumption or more than one question, thus making\
The answer is (D). \ a straightforward yes or no answer meaningless.\nA: Let's think step by step.\
\ We refer to Wikipedia articles on logical fallacies for help. The complex question\
\ fallacy is when someone makes a single yes or no answer to a question meaningless,\
Q: The complex question fallacy consists of \ by including either an unproven assumption or many questions. The latter is also\
\ known as the many questions fallacy. The answer is (D).\n\nQ: Arguing that what\
(A) arguing something is inferior just because it doesn''t do something it was never \ is true of the parts must be true of the whole is the fallacy of...\n(A) Division\
intended to do. (B) including more than one claim in the proposition and treating \ (B) Composition (C) Appeal to the person (D) Appeal to ignorance\nA: Let's think\
proof for one claim as proof for all the claims. (C) drawing a conclusion before \ step by step. We refer to Wikipedia articles on logical fallacies for help. Fallacy\
examining the evidence, and only considering evidence that supports that conclusion. \ of composition occurs when someone argues what is true of the parts must be true\
(D) asking a question that includes either an unproven assumption or more than one \ of the whole. The answer is (B).\n\nQ: Which of the following is true of a valid\
question, thus making a straightforward yes or no answer meaningless. \ categorical syllogism?\n(A) The minor premise must deny the antecedent (B) The\
\ major premise must affirm the consequent (C) The middle term must be used in at\
A: Let''s think step by step. We refer to Wikipedia articles on logical fallacies \ least one premise in a universal or unqualified sense (D) All of the above\nA:\
for help. The complex question fallacy is when someone makes a single yes or no \ Let's think step by step. We refer to Wikipedia articles on logical fallacies\
answer to a question meaningless, by including either an unproven assumption or \ for help. A valid categorical syllogism must satisfy several conditions: (1) the\
many questions. The latter is also known as the many questions fallacy. The answer \ syllogism must have exactly three terms (2) every term of the syllogism must be\
is (D). \ used twice exactly, (3) a term may be used only once in any premise, and (4) the\
\ middle term must be used in at least one premise in a universal or unqualified\
\ sense, etc. Only (C) is true. The answer is (C).\n\nQ: If someone attacks the\
Q: Arguing that what is true of the parts must be true of the whole is the fallacy \ character of an opposing arguer, instead of responding to that opponent's arguments,\
of... \ the first person has probably committed which of the following fallacies?\n(A)\
\ tu quoque (B) horse laugh (C) argument against the person (D) ignoratio elenchi\n\
(A) Division (B) Composition (C) Appeal to the person (D) Appeal to ignorance A: Let's think step by step. We refer to Wikipedia articles on logical fallacies\
\ for help. The argument against the person fallacy occurs when someone irrelevantly\
A: Let''s think step by step. We refer to Wikipedia articles on logical fallacies \ attacks the character of an opposing arguer, instead of addressing that opponent's\
for help. Fallacy of composition occurs when someone argues what is true of the \ arguments. The answer is (C)."
parts must be true of the whole. The answer is (B). "group": "mmlu_flan_cot_fewshot_humanities"
"include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_logical_fallacies"
Q: Which of the following is true of a valid categorical syllogism?
(A) The minor premise must deny the antecedent (B) The major premise must affirm
the consequent (C) The middle term must be used in at least one premise in a universal
or unqualified sense (D) All of the above
A: Let''s think step by step. We refer to Wikipedia articles on logical fallacies
for help. A valid categorical syllogism must satisfy several conditions: (1) the
syllogism must have exactly three terms (2) every term of the syllogism must be
used twice exactly, (3) a term may be used only once in any premise, and (4) the
middle term must be used in at least one premise in a universal or unqualified sense,
etc. Only (C) is true. The answer is (C).
Q: If someone attacks the character of an opposing arguer, instead of responding
to that opponent''s arguments, the first person has probably committed which of
the following fallacies?
(A) tu quoque (B) horse laugh (C) argument against the person (D) ignoratio elenchi
A: Let''s think step by step. We refer to Wikipedia articles on logical fallacies
for help. The argument against the person fallacy occurs when someone irrelevantly
attacks the character of an opposing arguer, instead of addressing that opponent''s
arguments. The answer is (C).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_logical_fallacies
dataset_name: machine_learning "dataset_name": "machine_learning"
description: "The following are multiple choice questions (with answers) about machine\ "description": "The following are multiple choice questions (with answers) about machine\
\ learning.\n\nQ: Which image data augmentation is most common for natural images?\n\ \ learning.\n\nQ: Which image data augmentation is most common for natural images?\n\
(A) random crop and horizontal flip (B) random crop and vertical flip (C) posterization\ (A) random crop and horizontal flip (B) random crop and vertical flip (C) posterization\
\ (D) dithering\nA: Let's think step by step. Data augmentation is used to increase\ \ (D) dithering\nA: Let's think step by step. Data augmentation is used to increase\
...@@ -12,48 +12,49 @@ description: "The following are multiple choice questions (with answers) about m
\ learning we consider a binary split according to whether the attribute is above\ \ learning we consider a binary split according to whether the attribute is above\
\ or below some threshold. Pat suggests that instead we should just have a multiway\ \ or below some threshold. Pat suggests that instead we should just have a multiway\
\ split with one branch for each of the distinct values of the attribute. From the\ \ split with one branch for each of the distinct values of the attribute. From the\
\ list below choose the single biggest problem with Pat\u2019s suggestion:\n(A)\ \ list below choose the single biggest problem with Pats suggestion:\n(A) It is\
\ It is too computationally expensive. (B) It would probably result in a decision\ \ too computationally expensive. (B) It would probably result in a decision tree\
\ tree that scores badly on the training set and a testset. (C) It would probably\ \ that scores badly on the training set and a testset. (C) It would probably result\
\ result in a decision tree that scores well on the training set but badly on a\ \ in a decision tree that scores well on the training set but badly on a testset.\
\ testset. (D) It would probably result in a decision tree that scores well on a\ \ (D) It would probably result in a decision tree that scores well on a testset\
\ testset but badly on a training set.\nA: Let's think step by step. Because the\ \ but badly on a training set.\nA: Let's think step by step. Because the input is\
\ input is real valued, it is unlikely that the same values appear both at training\ \ real valued, it is unlikely that the same values appear both at training and test\
\ and test time. This means that while such a decision tree could yield good performance\ \ time. This means that while such a decision tree could yield good performance\
\ on the training data, when evaluated on the test data it will perform badly because\ \ on the training data, when evaluated on the test data it will perform badly because\
\ the decision tree won\u2019t know what to do with numbers that did not appear\ \ the decision tree won’t know what to do with numbers that did not appear in the\
\ in the training data. The answer is (C).\n\nQ: You are reviewing papers for the\ \ training data. The answer is (C).\n\nQ: You are reviewing papers for the World’s\
\ World\u2019s Fanciest Machine Learning Conference, and you see submissions with\ \ Fanciest Machine Learning Conference, and you see submissions with the following\
\ the following claims. Which ones would you consider accepting?\n(A) My method\ \ claims. Which ones would you consider accepting?\n(A) My method achieves a training\
\ achieves a training error lower than all previous methods! (B) My method achieves\ \ error lower than all previous methods! (B) My method achieves a test error lower\
\ a test error lower than all previous methods! (Footnote: When regularisation parameter\ \ than all previous methods! (Footnote: When regularisation parameter λ is chosen\
\ \u03BB is chosen so as to minimise test error.) (C) My method achieves a test\ \ so as to minimise test error.) (C) My method achieves a test error lower than\
\ all previous methods! (Footnote: When regularisation parameter λ is chosen so\
\ as to minimise cross-validaton error.) (D) My method achieves a cross-validation\
\ error lower than all previous methods! (Footnote: When regularisation parameter\ \ error lower than all previous methods! (Footnote: When regularisation parameter\
\ \u03BB is chosen so as to minimise cross-validaton error.) (D) My method achieves\ \ λ is chosen so as to minimise cross-validaton error.)\nA: Let's think step by\
\ a cross-validation error lower than all previous methods! (Footnote: When regularisation\ \ step. In machine learning, we train with some data and fixed hyperparameters and\
\ parameter \u03BB is chosen so as to minimise cross-validaton error.)\nA: Let's\ \ the training error can be arbitrarily low, so (A) can’t be right. Then, one compares\
\ think step by step. In machine learning, we train with some data and fixed hyperparameters\ \ different hyperparameters by selecting the model with the lowest cross-validation\
\ and the training error can be arbitrarily low, so (A) can\u2019t be right. Then,\ \ error, this means that (B) and (D) are not the right procedure. The only relevant\
\ one compares different hyperparameters by selecting the model with the lowest\ \ number after these is the test error and thus (C) is the right answer. The answer\
\ cross-validation error, this means that (B) and (D) are not the right procedure.\ \ is (C).\n\nQ: A 6-sided die is rolled 15 times and the results are: side 1 comes\
\ The only relevant number after these is the test error and thus (C) is the right\ \ up 0 times; side 2: 1 time; side 3: 2 times; side 4: 3 times; side 5: 4 times;\
\ answer. The answer is (C).\n\nQ: A 6-sided die is rolled 15 times and the results\ \ side 6: 5 times. Based on these results, what is the probability of side 3 coming\
\ are: side 1 comes up 0 times; side 2: 1 time; side 3: 2 times; side 4: 3 times;\ \ up when using Add-1 Smoothing?\n(A) 2.0/15 (B) 1.0/7 (C) 3.0/16 (D) 1.0/5\nA:\
\ side 5: 4 times; side 6: 5 times. Based on these results, what is the probability\ \ Let's think step by step. Add-1 smoothing adds the value of one to the different\
\ of side 3 coming up when using Add-1 Smoothing?\n(A) 2.0/15 (B) 1.0/7 (C) 3.0/16\ \ counts and then normalizes the probabilities accordingly. The counts after adding\
\ (D) 1.0/5\nA: Let's think step by step. Add-1 smoothing adds the value of one\ \ one will be: side 1 comes up 1 time; side 2: 2 times; side 3: 3 times; side 4:\
\ to the different counts and then normalizes the probabilities accordingly. The\ \ 4 times; side 5: 5 times; side 6: 6 times. The number of sum one die rolls will\
\ counts after adding one will be: side 1 comes up 1 time; side 2: 2 times; side\ \ be 21, so the probability of drawing a three is 3/21 = 1/7. The answer is (B).\n\
\ 3: 3 times; side 4: 4 times; side 5: 5 times; side 6: 6 times. The number of sum\ \nQ: To achieve an 0/1 loss estimate that is less than 1 percent of the true 0/1\
\ one die rolls will be 21, so the probability of drawing a three is 3/21 = 1/7.\ \ loss (with probability 95%), according to Hoeffding's inequality the IID test\
\ The answer is (B).\n\nQ: To achieve an 0/1 loss estimate that is less than 1 percent\ \ set must have how many examples?\n(A) around 10 examples (B) around 100 examples\
\ of the true 0/1 loss (with probability 95%), according to Hoeffding's inequality\ \ (C) between 100 and 500 examples (D) more than 1000 examples\nA: Let's think step\
\ the IID test set must have how many examples?\n(A) around 10 examples (B) around\ \ by step. By the Hoeffding’s inequality, we expect that with 95% probability the\
\ 100 examples (C) between 100 and 500 examples (D) more than 1000 examples\nA:\ \ in-sample and out-of-sample errors differ by epsilon when we have N samples if\
\ Let's think step by step. By the Hoeffding\u2019s inequality, we expect that with\ \ 2 exp(-2 epsilon^2 N)<0.05, this implies that N > -1/(2*epsilon**2) log ( 0.05/2\
\ 95% probability the in-sample and out-of-sample errors differ by epsilon when\ \ )= log (40)*5000. Since log(40)>1, we have that one needs more than 1000 examples.\
\ we have N samples if 2 exp(-2 epsilon^2 N)<0.05, this implies that N > -1/(2*epsilon**2)\ \ The answer is (D)."
\ log ( 0.05/2 )= log (40)*5000. Since log(40)>1, we have that one needs more than\ "group": "mmlu_flan_cot_fewshot_stem"
\ 1000 examples. The answer is (D)." "include": "_mmlu_flan_cot_fewshot_template_yaml"
include: _mmlu_flan_cot_fewshot_template_yaml "task": "mmlu_flan_cot_fewshot_machine_learning"
task: mmlu_flan_cot_fewshot_machine_learning
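The two numeric few-shot answers in the machine_learning description above (Add-1 smoothing and the Hoeffding sample bound) can be sanity-checked with a short script. This is a minimal sketch and not part of the commit: the helper names `add_one_smoothing` and `hoeffding_min_samples` are hypothetical, and `epsilon = 0.01` is an assumption taken from the answer's own reading of "1 percent" (the answer's factor of 5000 equals 1/(2 * 0.01**2)).

```python
import math
from fractions import Fraction


def add_one_smoothing(counts):
    """Return Add-1 (Laplace) smoothed probabilities for a list of counts."""
    # Each outcome gains one pseudo-count, so the total grows by len(counts).
    total = sum(counts) + len(counts)
    return [Fraction(c + 1, total) for c in counts]


def hoeffding_min_samples(epsilon, delta):
    """Lower bound on N obtained by solving 2 * exp(-2 * epsilon**2 * N) < delta."""
    return math.log(2 / delta) / (2 * epsilon ** 2)


# Die counts from the prompt: sides 1..6 came up 0, 1, 2, 3, 4, 5 times.
counts = [0, 1, 2, 3, 4, 5]
probs = add_one_smoothing(counts)
p_side3 = probs[2]  # smoothed probability of side 3

# Assumed tolerance epsilon = 0.01 and failure probability delta = 0.05.
n_min = hoeffding_min_samples(epsilon=0.01, delta=0.05)

print(p_side3)
print(n_min)
```

Under these assumptions the smoothed counts sum to 21, so side 3 gets 3/21 = 1/7, and the bound is log(40) * 5000, which is well above 1000. Both results agree with the answers (B) and (D) given in the description.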
dataset_name: management "dataset_name": "management"
description: 'The following are multiple choice questions (with answers) about management. "description": "The following are multiple choice questions (with answers) about management.\n\
\nQ: How can organisational structures that are characterised by democratic and\
\ inclusive styles of management be described?\n(A) Hierarchical (B) Bureaucratic\
Q: How can organisational structures that are characterised by democratic and inclusive \ (C) Flat (D) Functional\nA: Let's think step by step. We refer to Wikipedia articles\
styles of management be described? \ on management for help. Flat organizational structures are characterized by democratic\
\ and inclusive styles of management, and have few (if any) levels of management\
(A) Hierarchical (B) Bureaucratic (C) Flat (D) Functional \ between the workers and managers. The answer is (C).\n\nQ: Hygiene factors are\
\ associated with which writer?\n(A) Frederick Hertzberg (B) D.C. McClelland (C)\
A: Let''s think step by step. We refer to Wikipedia articles on management for help. \ Abraham Maslow (D) Douglas McGregor\nA: Let's think step by step. We refer to\
Flat organizational structures are characterized by democratic and inclusive styles \ Wikipedia articles on management for help. Hygiene factors include compensation,\
of management, and have few (if any) levels of management between the workers and \ company policies, supervision, interpersonal relations, and work environments.\
managers. The answer is (C). \ Hertzberg lists them as factors that cannot motivate employees but can minimize\
\ job dissatisfaction. The answer is (A).\n\nQ: What characteristic is not a key\
\ feature of the 'open systems' model of management?\n(A) Morale (B) Innovation\
Q: Hygiene factors are associated with which writer? \ (C) Growth resource (D) Adaptation\nA: Let's think step by step. We refer to Wikipedia\
\ articles on management for help. The key characteristics of an open system in\
(A) Frederick Hertzberg (B) D.C. McClelland (C) Abraham Maslow (D) Douglas McGregor \ management include innovation, growth resource, and adaption, but do not include\
\ morale. The answer is (A).\n\nQ: Which element of the cultural web forms regalia?\n\
A: Let''s think step by step. We refer to Wikipedia articles on management for help. (A) Symbols (B) Rituals and routines (C) Power structures (D) Control systems\n\
Hygiene factors include compensation, company policies, supervision, interpersonal A: Let's think step by step. We refer to Wikipedia articles on management for help.\
relations, and work environments. Hertzberg lists them as factors that cannot motivate \ The cultural web is a tool for mapping an organization's culture, where symbols\
employees but can minimize job dissatisfaction. The answer is (A). \ form the regalia that visually expresses the values that the organization holds\
\ as important. The answer is (A).\n\nQ: What are the two main dimensions of the\
\ Ohio Studies into leadership?\n(A) Starting position and end position (B) Initial\
Q: What characteristic is not a key feature of the ''open systems'' model of management? \ environment and changed environment (C) Organisational structure and conditioning\
\ (D) Initiating structure and considerations\nA: Let's think step by step. We refer\
(A) Morale (B) Innovation (C) Growth resource (D) Adaptation \ to Wikipedia articles on management for help. The Ohio State Leadership Studies\
\ conducted in the 1940s identified initiating structure and consideration as the\
A: Let''s think step by step. We refer to Wikipedia articles on management for help. \ two main dimensions of leader behavior. The answer is (D)."
The key characteristics of an open system in management include innovation, growth "group": "mmlu_flan_cot_fewshot_other"
resource, and adaption, but do not include morale. The answer is (A). "include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_management"
Q: Which element of the cultural web forms regalia?
(A) Symbols (B) Rituals and routines (C) Power structures (D) Control systems
A: Let''s think step by step. We refer to Wikipedia articles on management for help.
The cultural web is a tool for mapping an organization''s culture, where symbols
form the regalia that visually expresses the values that the organization holds
as important. The answer is (A).
Q: What are the two main dimensions of the Ohio Studies into leadership?
(A) Starting position and end position (B) Initial environment and changed environment
(C) Organisational structure and conditioning (D) Initiating structure and considerations
A: Let''s think step by step. We refer to Wikipedia articles on management for help.
The Ohio State Leadership Studies conducted in the 1940s identified initiating structure
and consideration as the two main dimensions of leader behavior. The answer is (D).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_management
dataset_name: marketing "dataset_name": "marketing"
description: 'The following are multiple choice questions (with answers) about marketing. "description": "The following are multiple choice questions (with answers) about marketing.\n\
\nQ: Although the content and quality can be as controlled as direct mail, response\
\ rates of this medium are lower because of the lack of a personal address mechanism.\
Q: Although the content and quality can be as controlled as direct mail, response \ This media format is known as:\n(A) Care lines. (B) Direct mail. (C) Inserts.\
rates of this medium are lower because of the lack of a personal address mechanism. \ (D) Door to door.\nA: Let's think step by step. We refer to Wikipedia articles\
This media format is known as: \ on marketing for help. Door to door marketing delivers non-addressed items within\
\ all buildings within a geographic area. While it can control the content and quality\
(A) Care lines. (B) Direct mail. (C) Inserts. (D) Door to door. \ as well as direct mail marketing, its response rate is lower because of the lack\
\ of a personal address mechanism. The answer is (D).\n\nQ: In an organization,\
A: Let''s think step by step. We refer to Wikipedia articles on marketing for help. \ the group of people tasked with buying decisions is referred to as the _______________.\n\
Door to door marketing delivers non-addressed items within all buildings within (A) Outsourcing unit. (B) Procurement centre. (C) Chief executive unit. (D) Decision-making\
a geographic area. While it can control the content and quality as well as direct \ unit.\nA: Let's think step by step. We refer to Wikipedia articles on marketing\
mail marketing, its response rate is lower because of the lack of a personal address \ for help. In an organization, the group of the people tasked with buying decision\
mechanism. The answer is (D). \ is referred to as the decision-making unit. The answer is (D).\n\nQ: The single\
\ group within society that is most vulnerable to reference group influence is:\n\
(A) The older consumer who feels somewhat left out of things. (B) The married women,\
Q: In an organization, the group of people tasked with buying decisions is referred \ many of whom feel a need for stability in their lives. (C) New immigrants who\
to as the _______________. \ really want to assimilate into their new culture. (D) Children, who base most\
\ of their buying decisions on outside influences.\nA: Let's think step by step.\
(A) Outsourcing unit. (B) Procurement centre. (C) Chief executive unit. (D) Decision-making \ We refer to Wikipedia articles on marketing for help. Children, who mostly based\
unit. \ their buying decisions on outside influences, are the single group within society\
\ that is more vulnerable to reference group influence. The answer is (D).\n\nQ:\
A: Let''s think step by step. We refer to Wikipedia articles on marketing for help. \ Which of the following is an assumption in Maslow's hierarchy of needs?\n(A) Needs\
In an organization, the group of the people tasked with buying decision is referred \ are dependent on culture and also on social class. (B) Lower-level needs must\
to as the decision-making unit. The answer is (D). \ be at least partially satisfied before higher needs can affect behaviour. (C)\
\ Needs are not prioritized or arranged in any particular order. (D) Satisfied needs\
\ are motivators, and new needs emerge when current needs remain unmet.\nA: Let's\
Q: The single group within society that is most vulnerable to reference group influence \ think step by step. We refer to Wikipedia articles on marketing for help. Maslow's\
is: \ hierarchy of needs, from the bottom upwards, are physiological (food and clothing),\
\ safety, love and belonging needs, esteem, and self-actualization. Lower-level\
(A) The older consumer who feels somewhat left out of things. (B) The married women, \ needs must be at least partially satisfied before higher ones can affect behavior.\
many of whom feel a need for stability in their lives. (C) New immigrants who really \ The answer is (B).\n\nQ: _____________ is a natural outcome when combining demographic\
want to assimilate into their new culture. (D) Children, who base most of their \ and geographic variables.\n(A) Geodemographics (B) Product differentiation. (C)\
buying decisions on outside influences. \ ANSOFF matrix. (D) Brand management.\nA: Let's think step by step. We refer to\
\ Wikipedia articles on marketing for help. Geodemographics is a natural outcome\
A: Let''s think step by step. We refer to Wikipedia articles on marketing for help. \ when combining demographic and geographic variables. The answer is (A)."
Children, who mostly based their buying decisions on outside influences, are the "group": "mmlu_flan_cot_fewshot_other"
single group within society that is more vulnerable to reference group influence. "include": "_mmlu_flan_cot_fewshot_template_yaml"
The answer is (D). "task": "mmlu_flan_cot_fewshot_marketing"
Q: Which of the following is an assumption in Maslow''s hierarchy of needs?
(A) Needs are dependent on culture and also on social class. (B) Lower-level needs
must be at least partially satisfied before higher needs can affect behaviour. (C)
Needs are not prioritized or arranged in any particular order. (D) Satisfied needs
are motivators, and new needs emerge when current needs remain unmet.
A: Let''s think step by step. We refer to Wikipedia articles on marketing for help.
Maslow''s hierarchy of needs, from the bottom upwards, are physiological (food and
clothing), safety, love and belonging needs, esteem, and self-actualization. Lower-level
needs must be at least partially satisfied before higher ones can affect behavior.
The answer is (B).
Q: _____________ is a natural outcome when combining demographic and geographic
variables.
(A) Geodemographics (B) Product differentiation. (C) ANSOFF matrix. (D) Brand management.
A: Let''s think step by step. We refer to Wikipedia articles on marketing for help.
Geodemographics is a natural outcome when combining demographic and geographic variables.
The answer is (A).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_marketing
dataset_name: medical_genetics "dataset_name": "medical_genetics"
description: 'The following are multiple choice questions (with answers) about medical "description": "The following are multiple choice questions (with answers) about medical\
genetics. \ genetics.\n\nQ: The stage of meiosis in which chromosomes pair and cross over\
\ is:\n(A) prophase I (B) metaphase I (C) prophase II (D) metaphase II\nA: Let's\
\ think step by step. We refer to Wikipedia articles on medical genetics for help.\
Q: The stage of meiosis in which chromosomes pair and cross over is: \ Prophase I is the stage of meiosis where homologous chromosomes pair with each\
\ other and exchange genetic material. The answer is (A).\n\nQ: DNA ligase is\n\
(A) prophase I (B) metaphase I (C) prophase II (D) metaphase II (A) an enzyme that joins fragments in normal DNA replication (B) an enzyme of bacterial\
\ origin which cuts DNA at defined base sequences (C) an enzyme that facilitates\
A: Let''s think step by step. We refer to Wikipedia articles on medical genetics \ transcription of specific genes (D) an enzyme which limits the level to which\
for help. Prophase I is the stage of meiosis where homologous chromosomes pair with \ a particular nutrient reaches\nA: Let's think step by step. We refer to Wikipedia\
each other and exchange genetic material. The answer is (A). \ articles on medical genetics for help. DNA ligase is a type of enzyme (EC 6.5.1.1)\
\ responsible for joining DNA strands together by catalyzing a phosphodiester bond.\
\ The answer is (A).\n\nQ: Which of the following conditions does not show multifactorial\
Q: DNA ligase is \ inheritance?\n(A) Pyloric stenosis (B) Schizophrenia (C) Spina bifida (neural\
\ tube defects) (D) Marfan syndrome\nA: Let's think step by step. We refer to Wikipedia\
(A) an enzyme that joins fragments in normal DNA replication (B) an enzyme of bacterial \ articles on medical genetics for help. Multifactorial inheritance is when more\
origin which cuts DNA at defined base sequences (C) an enzyme that facilitates transcription \ than a single factor is responsible for causing a given trait or health problem.\
of specific genes (D) an enzyme which limits the level to which a particular nutrient \ Genes cannot be the only factor. Marfan syndrome, on the other hand, requires\
reaches \ only one abnormal copy of the of the Marfan gene, from one parent, to inherit\
\ the trait. The answer is (D).\n\nQ: A gene showing codominance\n(A) has both alleles\
A: Let''s think step by step. We refer to Wikipedia articles on medical genetics \ independently expressed in the heterozygote (B) has one allele dominant to the\
for help. DNA ligase is a type of enzyme (EC 6.5.1.1) responsible for joining DNA \ other (C) has alleles tightly linked on the same chromosome (D) has alleles expressed\
strands together by catalyzing a phosphodiester bond. The answer is (A). \ at the same time in development\nA: Let's think step by step. We refer to Wikipedia\
\ articles on medical genetics for help. Codominance, as it relates to genetics,\
\ refers to a type of genetic inheritance where the phenotype of both the parents\
Q: Which of the following conditions does not show multifactorial inheritance? \ is easily observed in the offspring. A heterozygote is an individual having two\
\ different alleles of a gene. The answer is (A).\n\nQ: Large triplet repeat expansions\
(A) Pyloric stenosis (B) Schizophrenia (C) Spina bifida (neural tube defects) (D) \ can be detected by:\n(A) polymerase chain reaction. (B) single strand conformational\
Marfan syndrome \ polymorphism analysis. (C) Southern blotting. (D) Western blotting.\nA: Let's\
\ think step by step. We refer to Wikipedia articles on medical genetics for help.\
A: Let''s think step by step. We refer to Wikipedia articles on medical genetics \ A Southern blot is a method in molecular biology for detecting specific DNA sequences\
for help. Multifactorial inheritance is when more than a single factor is responsible \ in a sample. Large triplet repeat expansions are usually detected with this method.\
for causing a given trait or health problem. Genes cannot be the only factor. Marfan \ The answer is (C)."
syndrome, on the other hand, requires only one abnormal copy of the of the Marfan "group": "mmlu_flan_cot_fewshot_other"
gene, from one parent, to inherit the trait. The answer is (D). "include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_medical_genetics"
Q: A gene showing codominance
(A) has both alleles independently expressed in the heterozygote (B) has one allele
dominant to the other (C) has alleles tightly linked on the same chromosome (D)
has alleles expressed at the same time in development
A: Let''s think step by step. We refer to Wikipedia articles on medical genetics
for help. Codominance, as it relates to genetics, refers to a type of genetic inheritance
where the phenotype of both the parents is easily observed in the offspring. A heterozygote
is an individual having two different alleles of a gene. The answer is (A).
Q: Large triplet repeat expansions can be detected by:
(A) polymerase chain reaction. (B) single strand conformational polymorphism analysis.
(C) Southern blotting. (D) Western blotting.
A: Let''s think step by step. We refer to Wikipedia articles on medical genetics
for help. A Southern blot is a method in molecular biology for detecting specific
DNA sequences in a sample. Large triplet repeat expansions are usually detected
with this method. The answer is (C).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_medical_genetics
dataset_name: miscellaneous "dataset_name": "miscellaneous"
description: 'The following are multiple choice questions (with answers) about miscellaneous. "description": "The following are multiple choice questions (with answers) about miscellaneous.\n\
\nQ: Which of these songs was a Top 10 hit for the rock band The Police?\n(A) 'Radio\
\ Ga-Ga' (B) 'Ob-la-di Ob-la-da' (C) 'De Do Do Do De Da Da Da' (D) 'In-a-Gadda-Da-Vida'\n\
Q: Which of these songs was a Top 10 hit for the rock band The Police? A: Let's think step by step. We refer to Wikipedia for help. Radio Ga-Ga is by Queen.\
\ Ob-la-di Ob-la-da is by The Beatles. And In-a-Gadda-Da-Vida is by Iron Butterfly.\
(A) ''Radio Ga-Ga'' (B) ''Ob-la-di Ob-la-da'' (C) ''De Do Do Do De Da Da Da'' (D) \ Leaving 'De Do Do Do De Da Da Da' as the only song by The Police, and also a Top\
''In-a-Gadda-Da-Vida'' \ 10 hit. The answer is (C).\n\nQ: What place is named in the title of the 1979\
\ live album by rock legends Cheap Trick?\n(A) Budapest (B) Budokan (C) Bhutan (D)\
A: Let''s think step by step. We refer to Wikipedia for help. Radio Ga-Ga is by \ Britain\nA: Let's think step by step. We refer to Wikipedia for help. Nippon Budokan\
Queen. Ob-la-di Ob-la-da is by The Beatles. And In-a-Gadda-Da-Vida is by Iron Butterfly. \ is an indoor arena in Tokyo, Japan renowned for hosting rock music concerts including\
Leaving ''De Do Do Do De Da Da Da'' as the only song by The Police, and also a Top \ Cheap Trick in 1978. 'Cheap Trick at Budokan' became the name of their album.\
10 hit. The answer is (C). \ The answer is (B).\n\nQ: What is produced during photosynthesis?\n(A) hydrogen\
\ (B) nylon (C) oxygen (D) light\nA: Let's think step by step. We refer to Wikipedia\
\ for help. Photosynthesis is the process in which green plants use the green pigment\
Q: What place is named in the title of the 1979 live album by rock legends Cheap \ chlorophyll to synthesize foods with water and carbon dioxide. Oxygen is the byproduct\
Trick? \ of this process. The answer is (C).\n\nQ: Who is the shortest man to ever win\
\ an NBA slam dunk competition?\n(A) Anthony 'Spud' Webb (B) Michael 'Air' Jordan\
(A) Budapest (B) Budokan (C) Bhutan (D) Britain \ (C) Tyrone 'Muggsy' Bogues (D) Julius 'Dr J' Erving\nA: Let's think step by step.\
\ We refer to Wikipedia for help. In 1986, Spud Webb, standing only 5'7\" became\
A: Let''s think step by step. We refer to Wikipedia for help. Nippon Budokan is \ the shortest NBA player in history to win an official slam dunk contest. The answer\
an indoor arena in Tokyo, Japan renowned for hosting rock music concerts including \ is (A).\n\nQ: How many axles does a standard automobile have?\n(A) one (B) two\
Cheap Trick in 1978. ''Cheap Trick at Budokan'' became the name of their album. \ (C) four (D) eight\nA: Let's think step by step. We refer to Wikipedia for help.\
The answer is (B). \ Most cars have two axles to rotate the wheels.. The answer is (B)."
"group": "mmlu_flan_cot_fewshot_other"
"include": "_mmlu_flan_cot_fewshot_template_yaml"
Q: What is produced during photosynthesis? "task": "mmlu_flan_cot_fewshot_miscellaneous"
(A) hydrogen (B) nylon (C) oxygen (D) light
A: Let''s think step by step. We refer to Wikipedia for help. Photosynthesis is
the process in which green plants use the green pigment chlorophyll to synthesize
foods with water and carbon dioxide. Oxygen is the byproduct of this process. The
answer is (C).
Q: Who is the shortest man to ever win an NBA slam dunk competition?
(A) Anthony ''Spud'' Webb (B) Michael ''Air'' Jordan (C) Tyrone ''Muggsy'' Bogues
(D) Julius ''Dr J'' Erving
A: Let''s think step by step. We refer to Wikipedia for help. In 1986, Spud Webb,
standing only 5''7" became the shortest NBA player in history to win an official
slam dunk contest. The answer is (A).
Q: How many axles does a standard automobile have?
(A) one (B) two (C) four (D) eight
A: Let''s think step by step. We refer to Wikipedia for help. Most cars have two
axles to rotate the wheels.. The answer is (B).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_miscellaneous
dataset_name: moral_disputes "dataset_name": "moral_disputes"
description: 'The following are multiple choice questions (with answers) about moral "description": "The following are multiple choice questions (with answers) about moral\
disputes. \ disputes.\n\nQ: Baron admits that the versions of the ticking bomb hypothetical\
\ she discusses are \"stunningly stupid,\" but she claims this is actually evidence\
\ of\n(A) the stupidity of most traditional philosophical examples. (B) a general\
Q: Baron admits that the versions of the ticking bomb hypothetical she discusses \ lack of intelligence among people with advanced degrees. (C) the wrongness of\
are "stunningly stupid," but she claims this is actually evidence of \ torture. (D) the readiness on the part of many intelligent people to see torture\
\ as the best solution to deal with terrorism.\nA: Let's think step by step. We\
(A) the stupidity of most traditional philosophical examples. (B) a general lack \ refer to Wikipedia articles on moral disputes for help. The ticking bomb hypothetical\
of intelligence among people with advanced degrees. (C) the wrongness of torture. \ poses a problem where many people will die to an exploding bomb, if the hypothetical\
(D) the readiness on the part of many intelligent people to see torture as the best \ terrorist does not disclose how to defuse it. Baron sees this hypothetical as\
solution to deal with terrorism. \ silly, but its prevalence does suggest intelligent people, particularly utilitarians,\
\ see torture as justifiable to save the lives in this scenario. The answer is (D).\n\
A: Let''s think step by step. We refer to Wikipedia articles on moral disputes for \nQ: A fertilized ovum is also known as\n(A) a zygote. (B) an embryo. (C) a viability.\
help. The ticking bomb hypothetical poses a problem where many people will die to \ (D) a blastocyst.\nA: Let's think step by step. We refer to Wikipedia articles\
an exploding bomb, if the hypothetical terrorist does not disclose how to defuse \ on moral disputes for help. Once a single sperm penetrates the layers of an egg\
it. Baron sees this hypothetical as silly, but its prevalence does suggest intelligent \ to form a new cell, that cell is called a zygote. The answer is (A).\n\nQ: Pence\
people, particularly utilitarians, see torture as justifiable to save the lives \ compares six different cases of reproduction, from natural twinning to SCNT. What\
in this scenario. The answer is (D). \ conclusion does he draw from this comparison?\n(A) SCNT is not a different kind\
\ of reproduction because there are no morally relevant differences between it and\
\ other permissible means of reproduction. (B) Because there is a low risk of harm\
Q: A fertilized ovum is also known as \ for natural twinning, there will be a low risk of harm for SCNT. (C) both A and\
\ B (D) neither A nor B\nA: Let's think step by step. We refer to Wikipedia articles\
(A) a zygote. (B) an embryo. (C) a viability. (D) a blastocyst. \ on moral disputes for help. SCNT, known as 'somatic cell nuclear transfer', is\
\ a technique in which the nucleus of a somatic (body) cell is transferred to the\
A: Let''s think step by step. We refer to Wikipedia articles on moral disputes for \ cytoplasm of an enucleated egg. This is used in farm animal cloning. Pence may\
help. Once a single sperm penetrates the layers of an egg to form a new cell, that \ compare this to other processes of reproduction, and see little moral separation.\
cell is called a zygote. The answer is (A). \ The answer is (A).\n\nQ: Which of the following is an intrinsic good, according\
\ to Singer?\n(A) being in the upper class (B) the ability to learn and solve problems\
\ (C) being more beautiful than most of the population (D) being shorter than average\n\
Q: Pence compares six different cases of reproduction, from natural twinning to A: Let's think step by step. We refer to Wikipedia articles on moral disputes for\
SCNT. What conclusion does he draw from this comparison? \ help. Bioethicist Peter Singer sees intrinsic value as innate values conferred\
\ by oneself, for oneself. Innanimate objects can be beautiful, short, or have some\
(A) SCNT is not a different kind of reproduction because there are no morally relevant \ valuable criteria, but capabilities are intrinsically good. The answer is (B).\n\
differences between it and other permissible means of reproduction. (B) Because \nQ: According to Metz, what is wrong with consequentialist arguments against capital\
there is a low risk of harm for natural twinning, there will be a low risk of harm \ punishment based on African values?\n(A) It is unclear as of yet whether or not\
for SCNT. (C) both A and B (D) neither A nor B \ capital punishment deters harm to the community. (B) It is unclear as of yet whether\
\ or not capital punishment deters harm to any individuals. (C) Consequentialism\
A: Let''s think step by step. We refer to Wikipedia articles on moral disputes for \ is not supported by African values. (D) Even though consequentialism is supported\
help. SCNT, known as ''somatic cell nuclear transfer'', is a technique in which \ by African values, no consequentialist arguments framed in terms of African values\
the nucleus of a somatic (body) cell is transferred to the cytoplasm of an enucleated \ have been offered.\nA: Let's think step by step. We refer to Wikipedia articles\
egg. This is used in farm animal cloning. Pence may compare this to other processes \ on moral disputes for help. Thaddeus Metz is a humanities research professor focusing\
of reproduction, and see little moral separation. The answer is (A). \ on ethical philosophy in South Africa. Metz has written the death penalty is unjustified\
\ as it treats individuals as incapable of communal relations. It is unclear that\
\ capital punishment is to the benefit of, or a deterrent of harm to the community.\
Q: Which of the following is an intrinsic good, according to Singer? \ The answer is (A)."
"group": "mmlu_flan_cot_fewshot_humanities"
(A) being in the upper class (B) the ability to learn and solve problems (C) being "include": "_mmlu_flan_cot_fewshot_template_yaml"
more beautiful than most of the population (D) being shorter than average "task": "mmlu_flan_cot_fewshot_moral_disputes"
A: Let''s think step by step. We refer to Wikipedia articles on moral disputes for
help. Bioethicist Peter Singer sees intrinsic value as innate values conferred by
oneself, for oneself. Innanimate objects can be beautiful, short, or have some valuable
criteria, but capabilities are intrinsically good. The answer is (B).
Q: According to Metz, what is wrong with consequentialist arguments against capital
punishment based on African values?
(A) It is unclear as of yet whether or not capital punishment deters harm to the
community. (B) It is unclear as of yet whether or not capital punishment deters
harm to any individuals. (C) Consequentialism is not supported by African values.
(D) Even though consequentialism is supported by African values, no consequentialist
arguments framed in terms of African values have been offered.
A: Let''s think step by step. We refer to Wikipedia articles on moral disputes for
help. Thaddeus Metz is a humanities research professor focusing on ethical philosophy
in South Africa. Metz has written the death penalty is unjustified as it treats
individuals as incapable of communal relations. It is unclear that capital punishment
is to the benefit of, or a deterrent of harm to the community. The answer is (A).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_moral_disputes
dataset_name: moral_scenarios "dataset_name": "moral_scenarios"
description: 'The following are multiple choice questions (with answers) about moral "description": "The following are multiple choice questions (with answers) about moral\
scenarios. \ scenarios.\n\nQ: For which of these two scenarios does the main character (who\
\ uses I/me/my) do something clearly morally wrong, according to ordinary moral\
\ standards in the US as of 2020? Scenario 1 | I took his wallet after he harassed\
Q: For which of these two scenarios does the main character (who uses I/me/my) do \ me at the bar. Scenario 2 | I made love to my newborn baby boy.\n(A) Wrong, Wrong\
something clearly morally wrong, according to ordinary moral standards in the US \ (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not wrong, Not wrong\nA: Let's think\
as of 2020? Scenario 1 | I took his wallet after he harassed me at the bar. Scenario \ step by step. We refer to Wikipedia articles on moral scenarios for help. Stealing\
2 | I made love to my newborn baby boy. \ from someone is not considered an ethical response to harrassment. And having\
\ sex with a minor is egregiously unethical. The answer is (A).\n\nQ: For which\
(A) Wrong, Wrong (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not wrong, Not wrong \ of these two scenarios does the main character (who uses I/me/my) do something\
\ clearly morally wrong, according to ordinary moral standards in the US as of 2020?\
A: Let''s think step by step. We refer to Wikipedia articles on moral scenarios \ Scenario 1 | I asked for a cigarette from the guy outside the store. Scenario\
for help. Stealing from someone is not considered an ethical response to harrassment. \ 2 | When the security guard left his post, I pocketed a rare coin.\n(A) Wrong,\
And having sex with a minor is egregiously unethical. The answer is (A). \ Wrong (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not wrong, Not wrong\nA: Let's\
\ think step by step. We refer to Wikipedia articles on moral scenarios for help.\
\ Asking for something is not morally wrong. Stealing is morally wrong. The answer\
Q: For which of these two scenarios does the main character (who uses I/me/my) do \ is (C).\n\nQ: For which of these two scenarios does the main character (who uses\
something clearly morally wrong, according to ordinary moral standards in the US \ I/me/my) do something clearly morally wrong, according to ordinary moral standards\
as of 2020? Scenario 1 | I asked for a cigarette from the guy outside the store. \ in the US as of 2020? Scenario 1 | During the house's demolition, I pushed him\
Scenario 2 | When the security guard left his post, I pocketed a rare coin. \ into a wall. Scenario 2 | When I choose a babysitter for my twin nieces, I don't\
\ care about her racial background.\n(A) Wrong, Wrong (B) Wrong, Not wrong (C) Not\
(A) Wrong, Wrong (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not wrong, Not wrong \ wrong, Wrong (D) Not wrong, Not wrong\nA: Let's think step by step. We refer to\
\ Wikipedia articles on moral scenarios for help. Physical violence is morally wrong.\
A: Let''s think step by step. We refer to Wikipedia articles on moral scenarios \ Being open to hiring people from different racial backgrounds is an expected moral\
for help. Asking for something is not morally wrong. Stealing is morally wrong. \ stance. The answer is (B).\n\nQ: For which of these two scenarios does the main\
The answer is (C). \ character (who uses I/me/my) do something clearly morally wrong, according to\
\ ordinary moral standards in the US as of 2020? Scenario 1 | I bought an old pick-up\
\ truck for work. Scenario 2 | I looked up at the gorgeous waterfall and threw a\
Q: For which of these two scenarios does the main character (who uses I/me/my) do \ rock at it.\n(A) Wrong, Wrong (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not\
something clearly morally wrong, according to ordinary moral standards in the US \ wrong, Not wrong\nA: Let's think step by step. We refer to Wikipedia articles\
as of 2020? Scenario 1 | During the house''s demolition, I pushed him into a wall. \ on moral scenarios for help. Neither of these are wrong as no one is wronged by\
Scenario 2 | When I choose a babysitter for my twin nieces, I don''t care about \ buing a pick-up truck or throwing a rock into a stream. The answer is (D).\n\n\
her racial background. Q: For which of these two scenarios does the main character (who uses I/me/my) do\
\ something clearly morally wrong, according to ordinary moral standards in the\
(A) Wrong, Wrong (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not wrong, Not wrong \ US as of 2020? Scenario 1 | I really loved my newborn baby boy. Scenario 2 | The\
\ new girl is embarrassed to live in a trailer park, so I invite the class to her\
A: Let''s think step by step. We refer to Wikipedia articles on moral scenarios \ house.\n(A) Wrong, Wrong (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not wrong,\
for help. Physical violence is morally wrong. Being open to hiring people from different \ Not wrong\nA: Let's think step by step. We refer to Wikipedia articles on moral\
racial backgrounds is an expected moral stance. The answer is (B). \ scenarios for help. Loving someone is not wrong. However, exposing something that\
\ someone is embarrassed about could be considered quite mean. The answer is (C)."
"group": "mmlu_flan_cot_fewshot_humanities"
Q: For which of these two scenarios does the main character (who uses I/me/my) do "include": "_mmlu_flan_cot_fewshot_template_yaml"
something clearly morally wrong, according to ordinary moral standards in the US "task": "mmlu_flan_cot_fewshot_moral_scenarios"
as of 2020? Scenario 1 | I bought an old pick-up truck for work. Scenario 2 | I
looked up at the gorgeous waterfall and threw a rock at it.
(A) Wrong, Wrong (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not wrong, Not wrong
A: Let''s think step by step. We refer to Wikipedia articles on moral scenarios
for help. Neither of these are wrong as no one is wronged by buing a pick-up truck
or throwing a rock into a stream. The answer is (D).
Q: For which of these two scenarios does the main character (who uses I/me/my) do
something clearly morally wrong, according to ordinary moral standards in the US
as of 2020? Scenario 1 | I really loved my newborn baby boy. Scenario 2 | The new
girl is embarrassed to live in a trailer park, so I invite the class to her house.
(A) Wrong, Wrong (B) Wrong, Not wrong (C) Not wrong, Wrong (D) Not wrong, Not wrong
A: Let''s think step by step. We refer to Wikipedia articles on moral scenarios
for help. Loving someone is not wrong. However, exposing something that someone
is embarrassed about could be considered quite mean. The answer is (C).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_moral_scenarios
dataset_name: nutrition "dataset_name": "nutrition"
description: 'The following are multiple choice questions (with answers) about nutrition. "description": "The following are multiple choice questions (with answers) about nutrition.\n\
\nQ: What is the first-line drug for patients with type 2 diabetes and obesity,\
\ as of 2020?\n(A) Acarbose (B) Metformin (C) Sulphonylureas (D) Insulin\nA: Let's\
Q: What is the first-line drug for patients with type 2 diabetes and obesity, as \ think step by step. We refer to Wikipedia articles on nutrition for help. Metformin\
of 2020? \ (Fortamet, Glumetza, or others) is usually the first medication prescribed for\
\ type 2 diabetes, as well as obesity. It works by lowering glucose production in\
(A) Acarbose (B) Metformin (C) Sulphonylureas (D) Insulin \ the liver and improving the body's sensitivity to insulin. The answer is (B).\n\
\nQ: Which of the following statements is correct (according to knowledge in 2020)?\n\
A: Let''s think step by step. We refer to Wikipedia articles on nutrition for help. (A) Consumers with phenylketonuria must avoid the consumption of the sweetener aspartame\
Metformin (Fortamet, Glumetza, or others) is usually the first medication prescribed \ (B) Consumers with phenylketonuria must avoid the consumption of the sweetener\
for type 2 diabetes, as well as obesity. It works by lowering glucose production \ saccharin (C) Consumers with phenylketonuria must avoid the consumption of the\
in the liver and improving the body''s sensitivity to insulin. The answer is (B). \ sweetener sucralose (D) Consumers with phenylketonuria must avoid the consumption\
\ of the sweetener acesulfame K\nA: Let's think step by step. We refer to Wikipedia\
\ articles on nutrition for help. People with phenylketonuria (PKU) cannot break\
Q: Which of the following statements is correct (according to knowledge in 2020)? \ down the amino acid phenylalanine. As it builds up in the blood and brain it can\
\ lead to brain damage. People with PKU should avoid foods that are converted to\
(A) Consumers with phenylketonuria must avoid the consumption of the sweetener aspartame \ phenylalanine in the body, such as aspartame. The answer is (A).\n\nQ: Which of\
(B) Consumers with phenylketonuria must avoid the consumption of the sweetener saccharin \ the following statements about iodine is correct, as of 2020?\n(A) 50% of adults\
(C) Consumers with phenylketonuria must avoid the consumption of the sweetener sucralose \ consume iodine at levels below the RNI (B) Dairy products are a poor source of\
(D) Consumers with phenylketonuria must avoid the consumption of the sweetener acesulfame \ iodine (C) The iodine content of organic milk is generally lower that the level\
K \ in non-organic milk (D) UK dietary reference values recommend an increase in iodine\
\ intake in pregnancy\nA: Let's think step by step. We refer to Wikipedia articles\
A: Let''s think step by step. We refer to Wikipedia articles on nutrition for help. \ on nutrition for help. Organic milk usually has less iodine content than non-organic\
People with phenylketonuria (PKU) cannot break down the amino acid phenylalanine. \ milk. The answer is (C).\n\nQ: Which of the following is the most plausible explanation\
As it builds up in the blood and brain it can lead to brain damage. People with \ for the protective effect of dietary fibre against cancer of the colon, as of\
PKU should avoid foods that are converted to phenylalanine in the body, such as \ 2020?\n(A) Propionic acid, formed during colonic fibre fermentation inhibits liver\
aspartame. The answer is (A). \ fatty acid synthesis (B) Butyric acid, formed during colonic fibre fermentation\
\ stimulates \"silencing\" of the SLC5A8 tumour suppressor gene (C) None of these\
\ options are correct (D) Butyric acid, formed during colonic fibre fermentation\
Q: Which of the following statements about iodine is correct, as of 2020? \ stimulates anti-oxidant defences in the colon\nA: Let's think step by step. We\
\ refer to Wikipedia articles on nutrition for help. Dietary fibre is inversely\
(A) 50% of adults consume iodine at levels below the RNI (B) Dairy products are \ proportional to the risk of colorectal cancer. This is presumed because butyric\
a poor source of iodine (C) The iodine content of organic milk is generally lower \ acid (BA) stimulates antioxidants which help protect the colon from cancerous\
that the level in non-organic milk (D) UK dietary reference values recommend an \ tumors. The answer is (D).\n\nQ: In a cohort study, the risk ratio of developing\
increase in iodine intake in pregnancy \ diabetes was 0.86 when comparing consumers of tea (the exposed) to those who did\
\ not drink tea (the unexposed). Which one statement is correct (according to knowledge\
A: Let''s think step by step. We refer to Wikipedia articles on nutrition for help. \ in 2020)?\n(A) The tea drinkers have lower risk of developing diabetes. (B) The\
Organic milk usually has less iodine content than non-organic milk. The answer is \ tea drinkers have higher risk of developing diabetes. (C) Based on the information\
(C). \ given we cannot tell if the observed difference in disease risk is the result\
\ of chance. (D) The risk ratio is close to the value one, so there is no difference\
\ in disease risk between the two groups.\nA: Let's think step by step. We refer\
Q: Which of the following is the most plausible explanation for the protective effect \ to Wikipedia articles on nutrition for help. The risk ratio is not sufficiently\
of dietary fibre against cancer of the colon, as of 2020? \ reduced that it could not be explained by random chance given the studies sample\
\ size. The answer is (C)."
(A) Propionic acid, formed during colonic fibre fermentation inhibits liver fatty "group": "mmlu_flan_cot_fewshot_other"
acid synthesis (B) Butyric acid, formed during colonic fibre fermentation stimulates "include": "_mmlu_flan_cot_fewshot_template_yaml"
"silencing" of the SLC5A8 tumour suppressor gene (C) None of these options are correct "task": "mmlu_flan_cot_fewshot_nutrition"
(D) Butyric acid, formed during colonic fibre fermentation stimulates anti-oxidant
defences in the colon
A: Let''s think step by step. We refer to Wikipedia articles on nutrition for help.
Dietary fibre is inversely proportional to the risk of colorectal cancer. This is
presumed because butyric acid (BA) stimulates antioxidants which help protect the
colon from cancerous tumors. The answer is (D).
Q: In a cohort study, the risk ratio of developing diabetes was 0.86 when comparing
consumers of tea (the exposed) to those who did not drink tea (the unexposed). Which
one statement is correct (according to knowledge in 2020)?
(A) The tea drinkers have lower risk of developing diabetes. (B) The tea drinkers
have higher risk of developing diabetes. (C) Based on the information given we cannot
tell if the observed difference in disease risk is the result of chance. (D) The
risk ratio is close to the value one, so there is no difference in disease risk
between the two groups.
A: Let''s think step by step. We refer to Wikipedia articles on nutrition for help.
The risk ratio is not sufficiently reduced that it could not be explained by random
chance given the studies sample size. The answer is (C).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_nutrition
dataset_name: philosophy "dataset_name": "philosophy"
description: "The following are multiple choice questions (with answers) about philosophy.\n\ "description": "The following are multiple choice questions (with answers) about philosophy.\n\
\nQ: The study of reality in the broadest sense, an inquiry into the elemental nature\ \nQ: The study of reality in the broadest sense, an inquiry into the elemental nature\
\ of the universe and the things in it, is known as _____.\n(A) metaphysics (B)\ \ of the universe and the things in it, is known as _____.\n(A) metaphysics (B)\
\ epistemology (C) quantum physics (D) axiology\nA: Let's think step by step. We\ \ epistemology (C) quantum physics (D) axiology\nA: Let's think step by step. We\
\ refer to Wikipedia articles on philosophy for help. Among the options, only metaphysics\ \ refer to Wikipedia articles on philosophy for help. Among the options, only metaphysics\
\ studies the nature of reality and existence. The answer is (A).\n\nQ: According\ \ studies the nature of reality and existence. The answer is (A).\n\nQ: According\
\ to Moore\u2019s \u201Cideal utilitarianism,\u201D the right action is the one\ \ to Moore’s ideal utilitarianism, the right action is the one that brings about\
\ that brings about the greatest amount of:\n(A) pleasure. (B) happiness. (C) good.\ \ the greatest amount of:\n(A) pleasure. (B) happiness. (C) good. (D) virtue.\n\
\ (D) virtue.\nA: Let's think step by step. We refer to Wikipedia articles on philosophy\ A: Let's think step by step. We refer to Wikipedia articles on philosophy for help.\
\ for help. Moore's \"ideal utilitarianism\" states that one's actions should maximize\ \ Moore's \"ideal utilitarianism\" states that one's actions should maximize intrinsic\
\ intrinsic goods. The answer is (C).\n\nQ: Before Tolstoy's Christian conversion,\ \ goods. The answer is (C).\n\nQ: Before Tolstoy's Christian conversion, what was\
\ what was his perspective on the meaning of life?\n(A) optimist (B) satisfied (C)\ \ his perspective on the meaning of life?\n(A) optimist (B) satisfied (C) nominally\
\ nominally religious (D) pessimist\nA: Let's think step by step. We refer to Wikipedia\ \ religious (D) pessimist\nA: Let's think step by step. We refer to Wikipedia articles\
\ articles on philosophy for help. Before his conversion, Tolstoy feels that life\ \ on philosophy for help. Before his conversion, Tolstoy feels that life was uncertain,\
\ was uncertain, which is a pessimist's point of view. The answer is (D).\n\nQ:\ \ which is a pessimist's point of view. The answer is (D).\n\nQ: According to d'Holbach,\
\ According to d'Holbach, people always act according to _____.\n(A) free choices\ \ people always act according to _____.\n(A) free choices (B) dictates of the soul\
\ (B) dictates of the soul (C) necessary natural laws (D) undetermined will\nA:\ \ (C) necessary natural laws (D) undetermined will\nA: Let's think step by step.\
\ Let's think step by step. We refer to Wikipedia articles on philosophy for help.\ \ We refer to Wikipedia articles on philosophy for help. d'Holbach believes that\
\ d'Holbach believes that people act according to necessary laws, and it proves\ \ people act according to necessary laws, and it proves nothing about people's free\
\ nothing about people's free will. The answer is (C).\n\nQ: Psychological egoism\ \ will. The answer is (C).\n\nQ: Psychological egoism is:\n(A) an ethical theory\
\ is:\n(A) an ethical theory about how we ought to behave. (B) a generalization\ \ about how we ought to behave. (B) a generalization concerning the way people tend\
\ concerning the way people tend to behave. (C) a claim about human nature and the\ \ to behave. (C) a claim about human nature and the ways people are capable of behaving.\
\ ways people are capable of behaving. (D) none of the above.\nA: Let's think step\ \ (D) none of the above.\nA: Let's think step by step. We refer to Wikipedia articles\
\ by step. We refer to Wikipedia articles on philosophy for help. Psychological\ \ on philosophy for help. Psychological egoism suggests that one behaves based on\
\ egoism suggests that one behaves based on what makes one feels good, hence it\ \ what makes one feels good, hence it is a claim about human nature and how humans\
\ is a claim about human nature and how humans are capable of behaving. The answer\ \ are capable of behaving. The answer is (C)."
\ is (C)." "group": "mmlu_flan_cot_fewshot_humanities"
include: _mmlu_flan_cot_fewshot_template_yaml "include": "_mmlu_flan_cot_fewshot_template_yaml"
task: mmlu_flan_cot_fewshot_philosophy "task": "mmlu_flan_cot_fewshot_philosophy"
dataset_name: prehistory "dataset_name": "prehistory"
description: 'The following are multiple choice questions (with answers) about prehistory. "description": "The following are multiple choice questions (with answers) about prehistory.\n\
\nQ: What is the approximate mean cranial capacity of Homo erectus?\n(A) under 650\
\ cc (B) about 800 cc (C) just under 1000 cc (D) 1200 cc\nA: Let's think step by\
Q: What is the approximate mean cranial capacity of Homo erectus? \ step. We refer to Wikipedia articles on prehistory for help. The average cranium\
\ capacity of Homo erectus is less than 1000 cubic cm. The answer is (C).\n\nQ:\
(A) under 650 cc (B) about 800 cc (C) just under 1000 cc (D) 1200 cc \ According to Timothy Pauketat, the evidence for social stratification and political\
\ power at Cahokia suggests:\n(A) a center of Mississippian civilization with conditions\
A: Let''s think step by step. We refer to Wikipedia articles on prehistory for help. \ similar to the rise of early states. (B) the limitations of authority in a Native\
The average cranium capacity of Homo erectus is less than 1000 cubic cm. The answer \ American society of egalitarian foragers. (C) a simple chiefdom or perhaps a complex\
is (C). \ chiefdom had evolved by A.D. 1500. (D) a center of Mississippian civilization\
\ with conditions similar to societies on the Northwest Coast of North America.\n\
A: Let's think step by step. We refer to Wikipedia articles on prehistory for help.\
Q: According to Timothy Pauketat, the evidence for social stratification and political \ Timothy Pauketat is known for his research on Cahokia, the center of the Mississippian\
power at Cahokia suggests: \ culture, where he found similar conditions to the rise of early states. The answer\
\ is (A).\n\nQ: Recent research on hominid species dating from the Middle Pliocene\
(A) a center of Mississippian civilization with conditions similar to the rise of \ indicates there was (as of 2020):\n(A) a great amount of species diversity, or\
early states. (B) the limitations of authority in a Native American society of egalitarian \ a single species that exhibited a lot of diversity. (B) very little species diversity\
foragers. (C) a simple chiefdom or perhaps a complex chiefdom had evolved by A.D. \ during this period and very few hominids. (C) decreased species diversity due\
1500. (D) a center of Mississippian civilization with conditions similar to societies \ to a prolonged ice age followed by a severe drought. (D) decreased species diversity\
on the Northwest Coast of North America. \ but increased numbers of hammerstones and flakes, indicating stone tool manufacture.\n\
A: Let's think step by step. We refer to Wikipedia articles on prehistory for help.\
A: Let''s think step by step. We refer to Wikipedia articles on prehistory for help. \ Recent research has recognized multiple hominid species from the Middle Pliocene,\
Timothy Pauketat is known for his research on Cahokia, the center of the Mississippian \ meaning that there is a great amount of species diversity or diversity in a single\
culture, where he found similar conditions to the rise of early states. The answer \ species. The answer is (A).\n\nQ: Researchers now believe that the decline of\
is (A). \ the Maya was caused chiefly by:\n(A) a cataclysm of some kind, such as an earthquake,\
\ volcano, or tsunami. (B) ecological degradation resulting from slash-and-burn\
\ farming techniques. (C) endless wars between neighboring Mayan city-states. (D)\
Q: Recent research on hominid species dating from the Middle Pliocene indicates \ practices of interbreeding that led to a steep rise in congenital disorders.\n\
there was (as of 2020): A: Let's think step by step. We refer to Wikipedia articles on prehistory for help.\
\ Researchers believe that the Maya collapse was mainly caused by over-exploitation\
(A) a great amount of species diversity, or a single species that exhibited a lot \ of natural resources like the slash-and-burn farming techniques. The answer is\
of diversity. (B) very little species diversity during this period and very few \ (B).\n\nQ: The great Mayan king Pacal built temples in the city of Palenque in\
hominids. (C) decreased species diversity due to a prolonged ice age followed by \ order to:\n(A) satisfy the powerful Mayan astronomer priests. (B) display his\
a severe drought. (D) decreased species diversity but increased numbers of hammerstones \ generosity to the common people, since they were allowed to live in the temples.\
and flakes, indicating stone tool manufacture. \ (C) frighten away enemies, in particular the Spaniards. (D) legitimize his kingship,\
\ since his father was not royal.\nA: Let's think step by step. We refer to Wikipedia\
A: Let''s think step by step. We refer to Wikipedia articles on prehistory for help. \ articles on prehistory for help. Pacal built the temples as the funerary monument\
Recent research has recognized multiple hominid species from the Middle Pliocene, \ to legitimize his kingship. The answer is (D)."
meaning that there is a great amount of species diversity or diversity in a single "group": "mmlu_flan_cot_fewshot_humanities"
species. The answer is (A). "include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_prehistory"
Q: Researchers now believe that the decline of the Maya was caused chiefly by:
(A) a cataclysm of some kind, such as an earthquake, volcano, or tsunami. (B) ecological
degradation resulting from slash-and-burn farming techniques. (C) endless wars between
neighboring Mayan city-states. (D) practices of interbreeding that led to a steep
rise in congenital disorders.
A: Let''s think step by step. We refer to Wikipedia articles on prehistory for help.
Researchers believe that the Maya collapse was mainly caused by over-exploitation
of natural resources like the slash-and-burn farming techniques. The answer is (B).
Q: The great Mayan king Pacal built temples in the city of Palenque in order to:
(A) satisfy the powerful Mayan astronomer priests. (B) display his generosity to
the common people, since they were allowed to live in the temples. (C) frighten
away enemies, in particular the Spaniards. (D) legitimize his kingship, since his
father was not royal.
A: Let''s think step by step. We refer to Wikipedia articles on prehistory for help.
Pacal built the temples as the funerary monument to legitimize his kingship. The
answer is (D).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_prehistory
dataset_name: professional_accounting "dataset_name": "professional_accounting"
description: "The following are multiple choice questions (with answers) about professional\ "description": "The following are multiple choice questions (with answers) about professional\
\ accounting.\n\nQ: An auditor traces the serial numbers on equipment to a nonissuer\u2019\ \ accounting.\n\nQ: An auditor traces the serial numbers on equipment to a nonissuer’s\
s subledger. Which of the following management assertions is supported by this test?\n\ \ subledger. Which of the following management assertions is supported by this test?\n\
(A) Valuation and allocation (B) Completeness (C) Rights and obligations (D) Presentation\ (A) Valuation and allocation (B) Completeness (C) Rights and obligations (D) Presentation\
\ and disclosure\nA: Let's think step by step. We refer to Wikipedia articles on\ \ and disclosure\nA: Let's think step by step. We refer to Wikipedia articles on\
\ accounting for help. The completeness assertion is tested by tracing supporting\ \ accounting for help. The completeness assertion is tested by tracing supporting\
...@@ -43,5 +43,6 @@ description: "The following are multiple choice questions (with answers) about p ...@@ -43,5 +43,6 @@ description: "The following are multiple choice questions (with answers) about p
\ transactions, only Proceeds from long-term debt belongs to the financing activities\ \ transactions, only Proceeds from long-term debt belongs to the financing activities\
\ section of cashflow, hence the amount reported should be $100000. The answer is\ \ section of cashflow, hence the amount reported should be $100000. The answer is\
\ (D)." \ (D)."
include: _mmlu_flan_cot_fewshot_template_yaml "group": "mmlu_flan_cot_fewshot_other"
task: mmlu_flan_cot_fewshot_professional_accounting "include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_professional_accounting"
dataset_name: professional_law "dataset_name": "professional_law"
description: "The following are multiple choice questions (with answers) about professional\ "description": "The following are multiple choice questions (with answers) about professional\
\ law.\n\nQ: A son owed a creditor $5,000. The son's father contacted the creditor\ \ law.\n\nQ: A son owed a creditor $5,000. The son's father contacted the creditor\
\ and told him that he wanted to pay the son's debt. The father signed a document\ \ and told him that he wanted to pay the son's debt. The father signed a document\
\ that stated the father would pay the son's debt at a rate of $500 a month for\ \ that stated the father would pay the son's debt at a rate of $500 a month for\
...@@ -101,5 +101,6 @@ description: "The following are multiple choice questions (with answers) about p ...@@ -101,5 +101,6 @@ description: "The following are multiple choice questions (with answers) about p
\ Amendment further supports the First Amendment by establishing a due process clause.\ \ Amendment further supports the First Amendment by establishing a due process clause.\
\ Hence the strongest argument should be the statute is overbroad and consequently\ \ Hence the strongest argument should be the statute is overbroad and consequently\
\ invalid under the First and Fourteenth Amendments. The answer is (D)." \ invalid under the First and Fourteenth Amendments. The answer is (D)."
include: _mmlu_flan_cot_fewshot_template_yaml "group": "mmlu_flan_cot_fewshot_humanities"
task: mmlu_flan_cot_fewshot_professional_law "include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_professional_law"