Commit f9cc0267 authored by Leo Gao

Use hashed version stability test instead

parent 10d4b64a
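The commit title refers to a hashed stability test. As a rough illustration only, the sketch below assumes that the 64-character hex strings stored in this commit are SHA-256 digests of a serialized request list; the serialization (`json.dumps` with `sort_keys=True`) and the names `digest_requests` and `is_stable` are hypothetical and not taken from the repository.

```python
import hashlib
import json


def digest_requests(requests):
    """Return a SHA-256 hex digest of a list of [context, continuation] pairs.

    Assumption: the serialization format here is illustrative, not the
    project's actual one.
    """
    # Serialize deterministically so identical requests always hash identically.
    payload = json.dumps(requests, sort_keys=True, ensure_ascii=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_stable(requests, stored_digest):
    """Compare freshly generated requests against a previously stored digest."""
    # Strip whitespace in case the stored file ends with a newline.
    return digest_requests(requests) == stored_digest.strip()


if __name__ == "__main__":
    sample = [
        ["Question: A wave transfers\nChoices:\nA. amplitude\nB. wavelength\n"
         "C. frequency\nD. energy\nAnswer:", " energy"],
    ]
    stored = digest_requests(sample)
    print(is_stable(sample, stored))
```

Storing only a digest keeps the fixture small, at the cost of not showing which request changed when the comparison fails.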
{"results": {"hendrycksTest-computer_security": {"acc": 0.2, "acc_stderr": 0.13333333333333333, "acc_norm": 0.2, "acc_norm_stderr": 0.13333333333333333}}, "versions": {"hendrycksTest-computer_security": 0}}
\ No newline at end of file
{"results": {"hendrycksTest-computer_security": {"acc": 0.24, "acc_norm": 0.27, "acc_norm_stderr": 0.044619604333847394, "acc_stderr": 0.042923469599092816}}, "versions": {"hendrycksTest-computer_security": 0}}
\ No newline at end of file
622f191ccfc7a597d99f39897ebe3f95a9ddce0e662fcfb411aa554b289bb355
\ No newline at end of file
[["Question: The complementary color of blue is\nChoices:\nA. magenta\nB. yellow\nC. cyan\nD. white\nAnswer:", " magenta"], ["Question: The complementary color of blue is\nChoices:\nA. magenta\nB. yellow\nC. cyan\nD. white\nAnswer:", " yellow"], ["Question: The complementary color of blue is\nChoices:\nA. magenta\nB. yellow\nC. cyan\nD. white\nAnswer:", " cyan"], ["Question: The complementary color of blue is\nChoices:\nA. magenta\nB. yellow\nC. cyan\nD. white\nAnswer:", " white"], ["Question: Which of these temperatures is likely when a container of water at 20\u00b0C is mixed with water at 28\u00b0C?\nChoices:\nA. 19\u00b0C\nB. 22\u00b0C\nC. 30\u00b0C\nD. Higher than 30\u00b0C\nAnswer:", " 19\u00b0C"], ["Question: Which of these temperatures is likely when a container of water at 20\u00b0C is mixed with water at 28\u00b0C?\nChoices:\nA. 19\u00b0C\nB. 22\u00b0C\nC. 30\u00b0C\nD. Higher than 30\u00b0C\nAnswer:", " 22\u00b0C"], ["Question: Which of these temperatures is likely when a container of water at 20\u00b0C is mixed with water at 28\u00b0C?\nChoices:\nA. 19\u00b0C\nB. 22\u00b0C\nC. 30\u00b0C\nD. Higher than 30\u00b0C\nAnswer:", " 30\u00b0C"], ["Question: Which of these temperatures is likely when a container of water at 20\u00b0C is mixed with water at 28\u00b0C?\nChoices:\nA. 19\u00b0C\nB. 22\u00b0C\nC. 30\u00b0C\nD. Higher than 30\u00b0C\nAnswer:", " Higher than 30\u00b0C"], ["Question: A wave transfers\nChoices:\nA. amplitude\nB. wavelength\nC. frequency\nD. energy\nAnswer:", " amplitude"], ["Question: A wave transfers\nChoices:\nA. amplitude\nB. wavelength\nC. frequency\nD. energy\nAnswer:", " wavelength"], ["Question: A wave transfers\nChoices:\nA. amplitude\nB. wavelength\nC. frequency\nD. energy\nAnswer:", " frequency"], ["Question: A wave transfers\nChoices:\nA. amplitude\nB. wavelength\nC. frequency\nD. energy\nAnswer:", " energy"], ["Question: The blueness of the daytime sky is due mostly to light\nChoices:\nA. absorption\nB. transmission\nC. 
reflection\nD. scattering\nAnswer:", " absorption"], ["Question: The blueness of the daytime sky is due mostly to light\nChoices:\nA. absorption\nB. transmission\nC. reflection\nD. scattering\nAnswer:", " transmission"], ["Question: The blueness of the daytime sky is due mostly to light\nChoices:\nA. absorption\nB. transmission\nC. reflection\nD. scattering\nAnswer:", " reflection"], ["Question: The blueness of the daytime sky is due mostly to light\nChoices:\nA. absorption\nB. transmission\nC. reflection\nD. scattering\nAnswer:", " scattering"], ["Question: The greenish-blue color of ocean water is due mostly to light that hasn\u2019t been\nChoices:\nA. absorbed\nB. reflected\nC. scattered\nD. refracted\nAnswer:", " absorbed"], ["Question: The greenish-blue color of ocean water is due mostly to light that hasn\u2019t been\nChoices:\nA. absorbed\nB. reflected\nC. scattered\nD. refracted\nAnswer:", " reflected"], ["Question: The greenish-blue color of ocean water is due mostly to light that hasn\u2019t been\nChoices:\nA. absorbed\nB. reflected\nC. scattered\nD. refracted\nAnswer:", " scattered"], ["Question: The greenish-blue color of ocean water is due mostly to light that hasn\u2019t been\nChoices:\nA. absorbed\nB. reflected\nC. scattered\nD. refracted\nAnswer:", " refracted"], ["Question: An electron can be speeded up by\nChoices:\nA. an electric field\nB. a magnetic field\nC. Both of these\nD. Neither of these\nAnswer:", " an electric field"], ["Question: An electron can be speeded up by\nChoices:\nA. an electric field\nB. a magnetic field\nC. Both of these\nD. Neither of these\nAnswer:", " a magnetic field"], ["Question: An electron can be speeded up by\nChoices:\nA. an electric field\nB. a magnetic field\nC. Both of these\nD. Neither of these\nAnswer:", " Both of these"], ["Question: An electron can be speeded up by\nChoices:\nA. an electric field\nB. a magnetic field\nC. Both of these\nD. 
Neither of these\nAnswer:", " Neither of these"], ["Question: When you scale up an object to 3 times its linear size, the surface area increases by\nChoices:\nA. 3 and the volume by 9.\nB. 3 and the volume by 27.\nC. 9 and the volume by 27.\nD. 4 and the volume by 8.\nAnswer:", " 3 and the volume by 9."], ["Question: When you scale up an object to 3 times its linear size, the surface area increases by\nChoices:\nA. 3 and the volume by 9.\nB. 3 and the volume by 27.\nC. 9 and the volume by 27.\nD. 4 and the volume by 8.\nAnswer:", " 3 and the volume by 27."], ["Question: When you scale up an object to 3 times its linear size, the surface area increases by\nChoices:\nA. 3 and the volume by 9.\nB. 3 and the volume by 27.\nC. 9 and the volume by 27.\nD. 4 and the volume by 8.\nAnswer:", " 9 and the volume by 27."], ["Question: When you scale up an object to 3 times its linear size, the surface area increases by\nChoices:\nA. 3 and the volume by 9.\nB. 3 and the volume by 27.\nC. 9 and the volume by 27.\nD. 4 and the volume by 8.\nAnswer:", " 4 and the volume by 8."], ["Question: For a tube closed at one end, the length of the tube at a frequency of 256 Hz will be\nChoices:\nA. one-quarter the value for a tube open at both ends\nB. one-half the value for a tube open at both ends\nC. twice the value for a tube open at both ends\nD. four times the value for a tube open at both ends\nAnswer:", " one-quarter the value for a tube open at both ends"], ["Question: For a tube closed at one end, the length of the tube at a frequency of 256 Hz will be\nChoices:\nA. one-quarter the value for a tube open at both ends\nB. one-half the value for a tube open at both ends\nC. twice the value for a tube open at both ends\nD. four times the value for a tube open at both ends\nAnswer:", " one-half the value for a tube open at both ends"], ["Question: For a tube closed at one end, the length of the tube at a frequency of 256 Hz will be\nChoices:\nA. 
one-quarter the value for a tube open at both ends\nB. one-half the value for a tube open at both ends\nC. twice the value for a tube open at both ends\nD. four times the value for a tube open at both ends\nAnswer:", " twice the value for a tube open at both ends"], ["Question: For a tube closed at one end, the length of the tube at a frequency of 256 Hz will be\nChoices:\nA. one-quarter the value for a tube open at both ends\nB. one-half the value for a tube open at both ends\nC. twice the value for a tube open at both ends\nD. four times the value for a tube open at both ends\nAnswer:", " four times the value for a tube open at both ends"], ["Question: To receive an electric shock there must be a\nChoices:\nA. current in one direction.\nB. presence of moisture.\nC. high voltage and low body resistance.\nD. voltage difference across part or all of your body.\nAnswer:", " current in one direction."], ["Question: To receive an electric shock there must be a\nChoices:\nA. current in one direction.\nB. presence of moisture.\nC. high voltage and low body resistance.\nD. voltage difference across part or all of your body.\nAnswer:", " presence of moisture."], ["Question: To receive an electric shock there must be a\nChoices:\nA. current in one direction.\nB. presence of moisture.\nC. high voltage and low body resistance.\nD. voltage difference across part or all of your body.\nAnswer:", " high voltage and low body resistance."], ["Question: To receive an electric shock there must be a\nChoices:\nA. current in one direction.\nB. presence of moisture.\nC. high voltage and low body resistance.\nD. voltage difference across part or all of your body.\nAnswer:", " voltage difference across part or all of your body."], ["Question: A certain element emits 1 alpha particle, and its products then emit 2 beta particles in succession. The atomic number of the resulting element is changed by\nChoices:\nA. zero\nB. minus 1\nC. minus 2\nD. 
plus 1\nAnswer:", " zero"], ["Question: A certain element emits 1 alpha particle, and its products then emit 2 beta particles in succession. The atomic number of the resulting element is changed by\nChoices:\nA. zero\nB. minus 1\nC. minus 2\nD. plus 1\nAnswer:", " minus 1"], ["Question: A certain element emits 1 alpha particle, and its products then emit 2 beta particles in succession. The atomic number of the resulting element is changed by\nChoices:\nA. zero\nB. minus 1\nC. minus 2\nD. plus 1\nAnswer:", " minus 2"], ["Question: A certain element emits 1 alpha particle, and its products then emit 2 beta particles in succession. The atomic number of the resulting element is changed by\nChoices:\nA. zero\nB. minus 1\nC. minus 2\nD. plus 1\nAnswer:", " plus 1"]]
\ No newline at end of file
{"results": {"hendrycksTest-conceptual_physics": {"acc": 0.3, "acc_stderr": 0.15275252316519464, "acc_norm": 0.3, "acc_norm_stderr": 0.15275252316519464}}, "versions": {"hendrycksTest-conceptual_physics": 0}}
\ No newline at end of file
{"results": {"hendrycksTest-conceptual_physics": {"acc": 0.2680851063829787, "acc_norm": 0.2553191489361702, "acc_norm_stderr": 0.028504856470514185, "acc_stderr": 0.028957342788342347}}, "versions": {"hendrycksTest-conceptual_physics": 0}}
\ No newline at end of file
cde76ba2c7382b4876e17136c94f52aca2774e50342ab757b2a2d18da370dcb6
\ No newline at end of file
[["Question: A white noise process will have\n\n(i) A zero mean\n\n(ii) A constant variance\n\n(iii) Autocovariances that are constant\n\n(iv) Autocovariances that are zero except at lag zero\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (ii) and (iv) only"], ["Question: A white noise process will have\n\n(i) A zero mean\n\n(ii) A constant variance\n\n(iii) Autocovariances that are constant\n\n(iv) Autocovariances that are zero except at lag zero\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (i) and (iii) only"], ["Question: A white noise process will have\n\n(i) A zero mean\n\n(ii) A constant variance\n\n(iii) Autocovariances that are constant\n\n(iv) Autocovariances that are zero except at lag zero\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (i), (ii), and (iii) only"], ["Question: A white noise process will have\n\n(i) A zero mean\n\n(ii) A constant variance\n\n(iii) Autocovariances that are constant\n\n(iv) Autocovariances that are zero except at lag zero\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (i), (ii), (iii), and (iv)"], ["Question: How many parameters will be required to be estimated in total for all equations of a standard form, unrestricted, tri-variate VAR(4), ignoring the intercepts?\nChoices:\nA. 12\nB. 4\nC. 3\nD. 36\nAnswer:", " 12"], ["Question: How many parameters will be required to be estimated in total for all equations of a standard form, unrestricted, tri-variate VAR(4), ignoring the intercepts?\nChoices:\nA. 12\nB. 4\nC. 3\nD. 
36\nAnswer:", " 4"], ["Question: How many parameters will be required to be estimated in total for all equations of a standard form, unrestricted, tri-variate VAR(4), ignoring the intercepts?\nChoices:\nA. 12\nB. 4\nC. 3\nD. 36\nAnswer:", " 3"], ["Question: How many parameters will be required to be estimated in total for all equations of a standard form, unrestricted, tri-variate VAR(4), ignoring the intercepts?\nChoices:\nA. 12\nB. 4\nC. 3\nD. 36\nAnswer:", " 36"], ["Question: Which of the following are plausible approaches to dealing with residual autocorrelation?\n\ni) Take logarithms of each of the variables\n\nii) Add lagged values of the variables to the regression equation\n\niii) Use dummy variables to remove outlying observations\n\niv) Try a model in first differenced form rather than in levels.\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (ii) and (iv) only"], ["Question: Which of the following are plausible approaches to dealing with residual autocorrelation?\n\ni) Take logarithms of each of the variables\n\nii) Add lagged values of the variables to the regression equation\n\niii) Use dummy variables to remove outlying observations\n\niv) Try a model in first differenced form rather than in levels.\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (i) and (iii) only"], ["Question: Which of the following are plausible approaches to dealing with residual autocorrelation?\n\ni) Take logarithms of each of the variables\n\nii) Add lagged values of the variables to the regression equation\n\niii) Use dummy variables to remove outlying observations\n\niv) Try a model in first differenced form rather than in levels.\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. 
(i), (ii), (iii), and (iv)\nAnswer:", " (i), (ii), and (iii) only"], ["Question: Which of the following are plausible approaches to dealing with residual autocorrelation?\n\ni) Take logarithms of each of the variables\n\nii) Add lagged values of the variables to the regression equation\n\niii) Use dummy variables to remove outlying observations\n\niv) Try a model in first differenced form rather than in levels.\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (i), (ii), (iii), and (iv)"], ["Question: Which one of the following is NOT an example of mis-specification of functional form?\nChoices:\nA. Using a linear specification when y scales as a function of the squares of x\nB. Using a linear specification when a double-logarithmic model would be more appropriate\nC. Modelling y as a function of x when in fact it scales as a function of 1/x\nD. Excluding a relevant variable from a linear regression model\nAnswer:", " Using a linear specification when y scales as a function of the squares of x"], ["Question: Which one of the following is NOT an example of mis-specification of functional form?\nChoices:\nA. Using a linear specification when y scales as a function of the squares of x\nB. Using a linear specification when a double-logarithmic model would be more appropriate\nC. Modelling y as a function of x when in fact it scales as a function of 1/x\nD. Excluding a relevant variable from a linear regression model\nAnswer:", " Using a linear specification when a double-logarithmic model would be more appropriate"], ["Question: Which one of the following is NOT an example of mis-specification of functional form?\nChoices:\nA. Using a linear specification when y scales as a function of the squares of x\nB. Using a linear specification when a double-logarithmic model would be more appropriate\nC. Modelling y as a function of x when in fact it scales as a function of 1/x\nD. 
Excluding a relevant variable from a linear regression model\nAnswer:", " Modelling y as a function of x when in fact it scales as a function of 1/x"], ["Question: Which one of the following is NOT an example of mis-specification of functional form?\nChoices:\nA. Using a linear specification when y scales as a function of the squares of x\nB. Using a linear specification when a double-logarithmic model would be more appropriate\nC. Modelling y as a function of x when in fact it scales as a function of 1/x\nD. Excluding a relevant variable from a linear regression model\nAnswer:", " Excluding a relevant variable from a linear regression model"], ["Question: The order condition is\nChoices:\nA. A necessary and sufficient condition for identification\nB. A necessary but not sufficient condition for identification\nC. A sufficient but not necessary condition for identification\nD. A condition that is nether necessary nor sufficient for identification\nAnswer:", " A necessary and sufficient condition for identification"], ["Question: The order condition is\nChoices:\nA. A necessary and sufficient condition for identification\nB. A necessary but not sufficient condition for identification\nC. A sufficient but not necessary condition for identification\nD. A condition that is nether necessary nor sufficient for identification\nAnswer:", " A necessary but not sufficient condition for identification"], ["Question: The order condition is\nChoices:\nA. A necessary and sufficient condition for identification\nB. A necessary but not sufficient condition for identification\nC. A sufficient but not necessary condition for identification\nD. A condition that is nether necessary nor sufficient for identification\nAnswer:", " A sufficient but not necessary condition for identification"], ["Question: The order condition is\nChoices:\nA. A necessary and sufficient condition for identification\nB. A necessary but not sufficient condition for identification\nC. 
A sufficient but not necessary condition for identification\nD. A condition that is nether necessary nor sufficient for identification\nAnswer:", " A condition that is nether necessary nor sufficient for identification"], ["Question: Which of the following is a disadvantage of the fixed effects approach to estimating a panel model?\nChoices:\nA. The model is likely to be technical to estimate\nB. The approach may not be valid if the composite error term is correlated with one or more of the explanatory variables\nC. The number of parameters to estimate may be large, resulting in a loss of degrees of freedom\nD. The fixed effects approach can only capture cross-sectional heterogeneity and not temporal variation in the dependent variable.\nAnswer:", " The model is likely to be technical to estimate"], ["Question: Which of the following is a disadvantage of the fixed effects approach to estimating a panel model?\nChoices:\nA. The model is likely to be technical to estimate\nB. The approach may not be valid if the composite error term is correlated with one or more of the explanatory variables\nC. The number of parameters to estimate may be large, resulting in a loss of degrees of freedom\nD. The fixed effects approach can only capture cross-sectional heterogeneity and not temporal variation in the dependent variable.\nAnswer:", " The approach may not be valid if the composite error term is correlated with one or more of the explanatory variables"], ["Question: Which of the following is a disadvantage of the fixed effects approach to estimating a panel model?\nChoices:\nA. The model is likely to be technical to estimate\nB. The approach may not be valid if the composite error term is correlated with one or more of the explanatory variables\nC. The number of parameters to estimate may be large, resulting in a loss of degrees of freedom\nD. 
The fixed effects approach can only capture cross-sectional heterogeneity and not temporal variation in the dependent variable.\nAnswer:", " The number of parameters to estimate may be large, resulting in a loss of degrees of freedom"], ["Question: Which of the following is a disadvantage of the fixed effects approach to estimating a panel model?\nChoices:\nA. The model is likely to be technical to estimate\nB. The approach may not be valid if the composite error term is correlated with one or more of the explanatory variables\nC. The number of parameters to estimate may be large, resulting in a loss of degrees of freedom\nD. The fixed effects approach can only capture cross-sectional heterogeneity and not temporal variation in the dependent variable.\nAnswer:", " The fixed effects approach can only capture cross-sectional heterogeneity and not temporal variation in the dependent variable."], ["Question: Which of the following statements are true concerning the standardised residuals (residuals divided by their respective conditional standard deviations) from an estimated GARCH model?\n\ni) They are assumed to be normally distributed\n\n\nii) Their squares will be related to their lagged squared values if the GARCH model is\n\nappropriate\n\n\niii) In practice, they are likely to have fat tails\n\n\niv) If the GARCH model is adequate, the standardised residuals and the raw residuals\n\nwill be identical\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. 
(i), (ii), (iii), and (iv)\nAnswer:", " (ii) and (iv) only"], ["Question: Which of the following statements are true concerning the standardised residuals (residuals divided by their respective conditional standard deviations) from an estimated GARCH model?\n\ni) They are assumed to be normally distributed\n\n\nii) Their squares will be related to their lagged squared values if the GARCH model is\n\nappropriate\n\n\niii) In practice, they are likely to have fat tails\n\n\niv) If the GARCH model is adequate, the standardised residuals and the raw residuals\n\nwill be identical\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (i) and (iii) only"], ["Question: Which of the following statements are true concerning the standardised residuals (residuals divided by their respective conditional standard deviations) from an estimated GARCH model?\n\ni) They are assumed to be normally distributed\n\n\nii) Their squares will be related to their lagged squared values if the GARCH model is\n\nappropriate\n\n\niii) In practice, they are likely to have fat tails\n\n\niv) If the GARCH model is adequate, the standardised residuals and the raw residuals\n\nwill be identical\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. (i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (i), (ii), and (iii) only"], ["Question: Which of the following statements are true concerning the standardised residuals (residuals divided by their respective conditional standard deviations) from an estimated GARCH model?\n\ni) They are assumed to be normally distributed\n\n\nii) Their squares will be related to their lagged squared values if the GARCH model is\n\nappropriate\n\n\niii) In practice, they are likely to have fat tails\n\n\niv) If the GARCH model is adequate, the standardised residuals and the raw residuals\n\nwill be identical\nChoices:\nA. (ii) and (iv) only\nB. (i) and (iii) only\nC. 
(i), (ii), and (iii) only\nD. (i), (ii), (iii), and (iv)\nAnswer:", " (i), (ii), (iii), and (iv)"], ["Question: Suppose that the Durbin Watson test is applied to a regression containing two explanatory variables plus a constant with 50 data points. The test statistic takes a value of 1.53. What is the appropriate conclusion?\nChoices:\nA. Residuals appear to be positively autocorrelated\nB. Residuals appear to be negatively autocorrelated\nC. Residuals appear not to be autocorrelated\nD. The test result is inconclusive\nAnswer:", " Residuals appear to be positively autocorrelated"], ["Question: Suppose that the Durbin Watson test is applied to a regression containing two explanatory variables plus a constant with 50 data points. The test statistic takes a value of 1.53. What is the appropriate conclusion?\nChoices:\nA. Residuals appear to be positively autocorrelated\nB. Residuals appear to be negatively autocorrelated\nC. Residuals appear not to be autocorrelated\nD. The test result is inconclusive\nAnswer:", " Residuals appear to be negatively autocorrelated"], ["Question: Suppose that the Durbin Watson test is applied to a regression containing two explanatory variables plus a constant with 50 data points. The test statistic takes a value of 1.53. What is the appropriate conclusion?\nChoices:\nA. Residuals appear to be positively autocorrelated\nB. Residuals appear to be negatively autocorrelated\nC. Residuals appear not to be autocorrelated\nD. The test result is inconclusive\nAnswer:", " Residuals appear not to be autocorrelated"], ["Question: Suppose that the Durbin Watson test is applied to a regression containing two explanatory variables plus a constant with 50 data points. The test statistic takes a value of 1.53. What is the appropriate conclusion?\nChoices:\nA. Residuals appear to be positively autocorrelated\nB. Residuals appear to be negatively autocorrelated\nC. Residuals appear not to be autocorrelated\nD. 
The test result is inconclusive\nAnswer:", " The test result is inconclusive"], ["Question: Suppose that we wished to evaluate the factors that affected the probability that an investor would choose an equity fund rather than a bond fund or a cash investment. Which class of model would be most appropriate?\nChoices:\nA. A logit model\nB. A multinomial logit\nC. A tobit model\nD. An ordered logit model\nAnswer:", " A logit model"], ["Question: Suppose that we wished to evaluate the factors that affected the probability that an investor would choose an equity fund rather than a bond fund or a cash investment. Which class of model would be most appropriate?\nChoices:\nA. A logit model\nB. A multinomial logit\nC. A tobit model\nD. An ordered logit model\nAnswer:", " A multinomial logit"], ["Question: Suppose that we wished to evaluate the factors that affected the probability that an investor would choose an equity fund rather than a bond fund or a cash investment. Which class of model would be most appropriate?\nChoices:\nA. A logit model\nB. A multinomial logit\nC. A tobit model\nD. An ordered logit model\nAnswer:", " A tobit model"], ["Question: Suppose that we wished to evaluate the factors that affected the probability that an investor would choose an equity fund rather than a bond fund or a cash investment. Which class of model would be most appropriate?\nChoices:\nA. A logit model\nB. A multinomial logit\nC. A tobit model\nD. An ordered logit model\nAnswer:", " An ordered logit model"], ["Question: Which of the following statements is TRUE concerning OLS estimation?\nChoices:\nA. OLS minimises the sum of the vertical distances from the points to the line\nB. OLS minimises the sum of the squares of the vertical distances from the points to the line\nC. OLS minimises the sum of the horizontal distances from the points to the line\nD. 
OLS minimises the sum of the squares of the horizontal distances from the points to the line.\nAnswer:", " OLS minimises the sum of the vertical distances from the points to the line"], ["Question: Which of the following statements is TRUE concerning OLS estimation?\nChoices:\nA. OLS minimises the sum of the vertical distances from the points to the line\nB. OLS minimises the sum of the squares of the vertical distances from the points to the line\nC. OLS minimises the sum of the horizontal distances from the points to the line\nD. OLS minimises the sum of the squares of the horizontal distances from the points to the line.\nAnswer:", " OLS minimises the sum of the squares of the vertical distances from the points to the line"], ["Question: Which of the following statements is TRUE concerning OLS estimation?\nChoices:\nA. OLS minimises the sum of the vertical distances from the points to the line\nB. OLS minimises the sum of the squares of the vertical distances from the points to the line\nC. OLS minimises the sum of the horizontal distances from the points to the line\nD. OLS minimises the sum of the squares of the horizontal distances from the points to the line.\nAnswer:", " OLS minimises the sum of the horizontal distances from the points to the line"], ["Question: Which of the following statements is TRUE concerning OLS estimation?\nChoices:\nA. OLS minimises the sum of the vertical distances from the points to the line\nB. OLS minimises the sum of the squares of the vertical distances from the points to the line\nC. OLS minimises the sum of the horizontal distances from the points to the line\nD. OLS minimises the sum of the squares of the horizontal distances from the points to the line.\nAnswer:", " OLS minimises the sum of the squares of the horizontal distances from the points to the line."]]
\ No newline at end of file
{"results": {"hendrycksTest-econometrics": {"acc": 0.1, "acc_stderr": 0.09999999999999999, "acc_norm": 0.1, "acc_norm_stderr": 0.09999999999999999}}, "versions": {"hendrycksTest-econometrics": 0}}
\ No newline at end of file
{"results": {"hendrycksTest-econometrics": {"acc": 0.24561403508771928, "acc_norm": 0.24561403508771928, "acc_norm_stderr": 0.04049339297748142, "acc_stderr": 0.040493392977481425}}, "versions": {"hendrycksTest-econometrics": 0}}
\ No newline at end of file
b9b5d8b8bb02696302ec6bc2a99bf987a5504d3bae0e529d2c8f263538c97518
\ No newline at end of file
[["Question: The number of output pins in 8085 microprocessors are\nChoices:\nA. 27.\nB. 40.\nC. 21.\nD. 19.\nAnswer:", " 27."], ["Question: The number of output pins in 8085 microprocessors are\nChoices:\nA. 27.\nB. 40.\nC. 21.\nD. 19.\nAnswer:", " 40."], ["Question: The number of output pins in 8085 microprocessors are\nChoices:\nA. 27.\nB. 40.\nC. 21.\nD. 19.\nAnswer:", " 21."], ["Question: The number of output pins in 8085 microprocessors are\nChoices:\nA. 27.\nB. 40.\nC. 21.\nD. 19.\nAnswer:", " 19."], ["Question: Inintel 8085A microprocessor ALE signal is made high to\nChoices:\nA. Enable the data bus to be used as low order address bus\nB. To latch data D0-D7 from data bus\nC. To disable data bus\nD. To achieve all the functions listed above\nAnswer:", " Enable the data bus to be used as low order address bus"], ["Question: Inintel 8085A microprocessor ALE signal is made high to\nChoices:\nA. Enable the data bus to be used as low order address bus\nB. To latch data D0-D7 from data bus\nC. To disable data bus\nD. To achieve all the functions listed above\nAnswer:", " To latch data D0-D7 from data bus"], ["Question: Inintel 8085A microprocessor ALE signal is made high to\nChoices:\nA. Enable the data bus to be used as low order address bus\nB. To latch data D0-D7 from data bus\nC. To disable data bus\nD. To achieve all the functions listed above\nAnswer:", " To disable data bus"], ["Question: Inintel 8085A microprocessor ALE signal is made high to\nChoices:\nA. Enable the data bus to be used as low order address bus\nB. To latch data D0-D7 from data bus\nC. To disable data bus\nD. To achieve all the functions listed above\nAnswer:", " To achieve all the functions listed above"], ["Question: According to the Bohr model, an electron gains or losses energy only by\nChoices:\nA. moving faster or slower in an allowed orbit.\nB. jumping from one allowed orbit to another.\nC. being completely removed from an atom.\nD. 
jumping from one atom to another atom.\nAnswer:", " moving faster or slower in an allowed orbit."], ["Question: According to the Bohr model, an electron gains or losses energy only by\nChoices:\nA. moving faster or slower in an allowed orbit.\nB. jumping from one allowed orbit to another.\nC. being completely removed from an atom.\nD. jumping from one atom to another atom.\nAnswer:", " jumping from one allowed orbit to another."], ["Question: According to the Bohr model, an electron gains or losses energy only by\nChoices:\nA. moving faster or slower in an allowed orbit.\nB. jumping from one allowed orbit to another.\nC. being completely removed from an atom.\nD. jumping from one atom to another atom.\nAnswer:", " being completely removed from an atom."], ["Question: According to the Bohr model, an electron gains or losses energy only by\nChoices:\nA. moving faster or slower in an allowed orbit.\nB. jumping from one allowed orbit to another.\nC. being completely removed from an atom.\nD. jumping from one atom to another atom.\nAnswer:", " jumping from one atom to another atom."], ["Question: Systematic errors are\nChoices:\nA. environmental errors.\nB. observational errors.\nC. instrument errors.\nD. all of the above.\nAnswer:", " environmental errors."], ["Question: Systematic errors are\nChoices:\nA. environmental errors.\nB. observational errors.\nC. instrument errors.\nD. all of the above.\nAnswer:", " observational errors."], ["Question: Systematic errors are\nChoices:\nA. environmental errors.\nB. observational errors.\nC. instrument errors.\nD. all of the above.\nAnswer:", " instrument errors."], ["Question: Systematic errors are\nChoices:\nA. environmental errors.\nB. observational errors.\nC. instrument errors.\nD. all of the above.\nAnswer:", " all of the above."], ["Question: To obtain a high value of capacitance, the permittivity of dielectric medium should be\nChoices:\nA. low\nB. zero\nC. high\nD. 
unity\nAnswer:", " low"], ["Question: To obtain a high value of capacitance, the permittivity of dielectric medium should be\nChoices:\nA. low\nB. zero\nC. high\nD. unity\nAnswer:", " zero"], ["Question: To obtain a high value of capacitance, the permittivity of dielectric medium should be\nChoices:\nA. low\nB. zero\nC. high\nD. unity\nAnswer:", " high"], ["Question: To obtain a high value of capacitance, the permittivity of dielectric medium should be\nChoices:\nA. low\nB. zero\nC. high\nD. unity\nAnswer:", " unity"], ["Question: What are the sets of commands in a program which are not translated into machine instructions during assembly process, called?\nChoices:\nA. Mnemonics\nB. Directives\nC. Identifiers\nD. Operands\nAnswer:", " Mnemonics"], ["Question: What are the sets of commands in a program which are not translated into machine instructions during assembly process, called?\nChoices:\nA. Mnemonics\nB. Directives\nC. Identifiers\nD. Operands\nAnswer:", " Directives"], ["Question: What are the sets of commands in a program which are not translated into machine instructions during assembly process, called?\nChoices:\nA. Mnemonics\nB. Directives\nC. Identifiers\nD. Operands\nAnswer:", " Identifiers"], ["Question: What are the sets of commands in a program which are not translated into machine instructions during assembly process, called?\nChoices:\nA. Mnemonics\nB. Directives\nC. Identifiers\nD. Operands\nAnswer:", " Operands"], ["Question: Which of the following methods is/are used for reactive or voltage compensation\nChoices:\nA. shunt capacitor\nB. series capacitor\nC. generation excitation control\nD. all of the above\nAnswer:", " shunt capacitor"], ["Question: Which of the following methods is/are used for reactive or voltage compensation\nChoices:\nA. shunt capacitor\nB. series capacitor\nC. generation excitation control\nD. 
all of the above\nAnswer:", " series capacitor"], ["Question: Which of the following methods is/are used for reactive or voltage compensation\nChoices:\nA. shunt capacitor\nB. series capacitor\nC. generation excitation control\nD. all of the above\nAnswer:", " generation excitation control"], ["Question: Which of the following methods is/are used for reactive or voltage compensation\nChoices:\nA. shunt capacitor\nB. series capacitor\nC. generation excitation control\nD. all of the above\nAnswer:", " all of the above"], ["Question: Lowest critical frequency is due to pole and it may be present at the origin or nearer to the origin, then the type of network is\nChoices:\nA. LC.\nB. RL.\nC. RC.\nD. Any of the above.\nAnswer:", " LC."], ["Question: Lowest critical frequency is due to pole and it may be present at the origin or nearer to the origin, then the type of network is\nChoices:\nA. LC.\nB. RL.\nC. RC.\nD. Any of the above.\nAnswer:", " RL."], ["Question: Lowest critical frequency is due to pole and it may be present at the origin or nearer to the origin, then the type of network is\nChoices:\nA. LC.\nB. RL.\nC. RC.\nD. Any of the above.\nAnswer:", " RC."], ["Question: Lowest critical frequency is due to pole and it may be present at the origin or nearer to the origin, then the type of network is\nChoices:\nA. LC.\nB. RL.\nC. RC.\nD. Any of the above.\nAnswer:", " Any of the above."], ["Question: Stability of a transmission line can be increased by\nChoices:\nA. shunt capacitor\nB. series capacitor\nC. shunt reactor\nD. both A and B\nAnswer:", " shunt capacitor"], ["Question: Stability of a transmission line can be increased by\nChoices:\nA. shunt capacitor\nB. series capacitor\nC. shunt reactor\nD. both A and B\nAnswer:", " series capacitor"], ["Question: Stability of a transmission line can be increased by\nChoices:\nA. shunt capacitor\nB. series capacitor\nC. shunt reactor\nD. 
both A and B\nAnswer:", " shunt reactor"], ["Question: Stability of a transmission line can be increased by\nChoices:\nA. shunt capacitor\nB. series capacitor\nC. shunt reactor\nD. both A and B\nAnswer:", " both A and B"], ["Question: If holding current of a thyristor is 2 mA then latching current should be\nChoices:\nA. 0.01 A.\nB. 0.002 A.\nC. 0.009 A.\nD. 0.004 A.\nAnswer:", " 0.01 A."], ["Question: If holding current of a thyristor is 2 mA then latching current should be\nChoices:\nA. 0.01 A.\nB. 0.002 A.\nC. 0.009 A.\nD. 0.004 A.\nAnswer:", " 0.002 A."], ["Question: If holding current of a thyristor is 2 mA then latching current should be\nChoices:\nA. 0.01 A.\nB. 0.002 A.\nC. 0.009 A.\nD. 0.004 A.\nAnswer:", " 0.009 A."], ["Question: If holding current of a thyristor is 2 mA then latching current should be\nChoices:\nA. 0.01 A.\nB. 0.002 A.\nC. 0.009 A.\nD. 0.004 A.\nAnswer:", " 0.004 A."]]
\ No newline at end of file
{"results": {"hendrycksTest-electrical_engineering": {"acc": 0.2, "acc_stderr": 0.13333333333333333, "acc_norm": 0.2, "acc_norm_stderr": 0.13333333333333333}}, "versions": {"hendrycksTest-electrical_engineering": 0}}
\ No newline at end of file
{"results": {"hendrycksTest-electrical_engineering": {"acc": 0.2689655172413793, "acc_norm": 0.2827586206896552, "acc_norm_stderr": 0.037528339580033376, "acc_stderr": 0.036951833116502325}}, "versions": {"hendrycksTest-electrical_engineering": 0}}
\ No newline at end of file
6b21f5cd5606268421a667152ec989424b66905c02adbab8d4ff6bb9d21b77d1
\ No newline at end of file
[["Question: Find 13 over 14 + 7 over 14.\nChoices:\nA. 1 and 4 over 7\nB. 1 and 3 over 7\nC. 1 and 5 over 14\nD. 20 over 28\nAnswer:", " 1 and 4 over 7"], ["Question: Find 13 over 14 + 7 over 14.\nChoices:\nA. 1 and 4 over 7\nB. 1 and 3 over 7\nC. 1 and 5 over 14\nD. 20 over 28\nAnswer:", " 1 and 3 over 7"], ["Question: Find 13 over 14 + 7 over 14.\nChoices:\nA. 1 and 4 over 7\nB. 1 and 3 over 7\nC. 1 and 5 over 14\nD. 20 over 28\nAnswer:", " 1 and 5 over 14"], ["Question: Find 13 over 14 + 7 over 14.\nChoices:\nA. 1 and 4 over 7\nB. 1 and 3 over 7\nC. 1 and 5 over 14\nD. 20 over 28\nAnswer:", " 20 over 28"], ["Question: Conor made 9 shapes with straws. Each shape had 5 straws. Conor used 15 more straws to make more shapes. Whatis the total number of straws Conor used to make all the shapes?\nChoices:\nA. 20\nB. 29\nC. 45\nD. 60\nAnswer:", " 20"], ["Question: Conor made 9 shapes with straws. Each shape had 5 straws. Conor used 15 more straws to make more shapes. Whatis the total number of straws Conor used to make all the shapes?\nChoices:\nA. 20\nB. 29\nC. 45\nD. 60\nAnswer:", " 29"], ["Question: Conor made 9 shapes with straws. Each shape had 5 straws. Conor used 15 more straws to make more shapes. Whatis the total number of straws Conor used to make all the shapes?\nChoices:\nA. 20\nB. 29\nC. 45\nD. 60\nAnswer:", " 45"], ["Question: Conor made 9 shapes with straws. Each shape had 5 straws. Conor used 15 more straws to make more shapes. Whatis the total number of straws Conor used to make all the shapes?\nChoices:\nA. 20\nB. 29\nC. 45\nD. 60\nAnswer:", " 60"], ["Question: If a freight train travels at a speed of 20 miles per hour for 6 hours, how far will it travel?\nChoices:\nA. 120 miles\nB. 80 miles\nC. 26 miles\nD. 12 miles\nAnswer:", " 120 miles"], ["Question: If a freight train travels at a speed of 20 miles per hour for 6 hours, how far will it travel?\nChoices:\nA. 120 miles\nB. 80 miles\nC. 26 miles\nD. 
12 miles\nAnswer:", " 80 miles"], ["Question: If a freight train travels at a speed of 20 miles per hour for 6 hours, how far will it travel?\nChoices:\nA. 120 miles\nB. 80 miles\nC. 26 miles\nD. 12 miles\nAnswer:", " 26 miles"], ["Question: If a freight train travels at a speed of 20 miles per hour for 6 hours, how far will it travel?\nChoices:\nA. 120 miles\nB. 80 miles\nC. 26 miles\nD. 12 miles\nAnswer:", " 12 miles"], ["Question: A worker on an assembly line takes 7 hours to produce 22 parts. At that rate how many parts can she produce in 35 hours?\nChoices:\nA. 220 parts\nB. 770 parts\nC. 4 parts\nD. 110 parts\nAnswer:", " 220 parts"], ["Question: A worker on an assembly line takes 7 hours to produce 22 parts. At that rate how many parts can she produce in 35 hours?\nChoices:\nA. 220 parts\nB. 770 parts\nC. 4 parts\nD. 110 parts\nAnswer:", " 770 parts"], ["Question: A worker on an assembly line takes 7 hours to produce 22 parts. At that rate how many parts can she produce in 35 hours?\nChoices:\nA. 220 parts\nB. 770 parts\nC. 4 parts\nD. 110 parts\nAnswer:", " 4 parts"], ["Question: A worker on an assembly line takes 7 hours to produce 22 parts. At that rate how many parts can she produce in 35 hours?\nChoices:\nA. 220 parts\nB. 770 parts\nC. 4 parts\nD. 110 parts\nAnswer:", " 110 parts"], ["Question: There were 6 rows of chairs set up for a meeting. Each row had 8 chairs. What was the total number of chairs set up for the meeting?\nChoices:\nA. 14\nB. 36\nC. 48\nD. 64\nAnswer:", " 14"], ["Question: There were 6 rows of chairs set up for a meeting. Each row had 8 chairs. What was the total number of chairs set up for the meeting?\nChoices:\nA. 14\nB. 36\nC. 48\nD. 64\nAnswer:", " 36"], ["Question: There were 6 rows of chairs set up for a meeting. Each row had 8 chairs. What was the total number of chairs set up for the meeting?\nChoices:\nA. 14\nB. 36\nC. 48\nD. 64\nAnswer:", " 48"], ["Question: There were 6 rows of chairs set up for a meeting. 
Each row had 8 chairs. What was the total number of chairs set up for the meeting?\nChoices:\nA. 14\nB. 36\nC. 48\nD. 64\nAnswer:", " 64"], ["Question: Estimate 711 + 497. The sum is between which numbers?\nChoices:\nA. 50 and 400\nB. 450 and 700\nC. 750 and 1,000\nD. 1,050 and 1,300\nAnswer:", " 50 and 400"], ["Question: Estimate 711 + 497. The sum is between which numbers?\nChoices:\nA. 50 and 400\nB. 450 and 700\nC. 750 and 1,000\nD. 1,050 and 1,300\nAnswer:", " 450 and 700"], ["Question: Estimate 711 + 497. The sum is between which numbers?\nChoices:\nA. 50 and 400\nB. 450 and 700\nC. 750 and 1,000\nD. 1,050 and 1,300\nAnswer:", " 750 and 1,000"], ["Question: Estimate 711 + 497. The sum is between which numbers?\nChoices:\nA. 50 and 400\nB. 450 and 700\nC. 750 and 1,000\nD. 1,050 and 1,300\nAnswer:", " 1,050 and 1,300"], ["Question: Miranda enlarged a picture proportionally. Her original picture is 4 cm wide and 6 cm long. If the new, larger picture is 10 cm wide, what is its length?\nChoices:\nA. 8 cm\nB. 12 cm\nC. 15 cm\nD. 20 cm\nAnswer:", " 8 cm"], ["Question: Miranda enlarged a picture proportionally. Her original picture is 4 cm wide and 6 cm long. If the new, larger picture is 10 cm wide, what is its length?\nChoices:\nA. 8 cm\nB. 12 cm\nC. 15 cm\nD. 20 cm\nAnswer:", " 12 cm"], ["Question: Miranda enlarged a picture proportionally. Her original picture is 4 cm wide and 6 cm long. If the new, larger picture is 10 cm wide, what is its length?\nChoices:\nA. 8 cm\nB. 12 cm\nC. 15 cm\nD. 20 cm\nAnswer:", " 15 cm"], ["Question: Miranda enlarged a picture proportionally. Her original picture is 4 cm wide and 6 cm long. If the new, larger picture is 10 cm wide, what is its length?\nChoices:\nA. 8 cm\nB. 12 cm\nC. 15 cm\nD. 20 cm\nAnswer:", " 20 cm"], ["Question: Ms. Fisher used the expression (6 \u00d7 8) \u00d7 12 to find the total number of markers needed for her students\u2019 art project. Which expression is equal to the one used by Ms. Fisher?\nChoices:\nA. 
6 + (8 + 12)\nB. 6 + (8 \u00d7 12)\nC. 6 \u00d7 (8 + 12)\nD. 6 \u00d7 (8 \u00d7 12)\nAnswer:", " 6 + (8 + 12)"], ["Question: Ms. Fisher used the expression (6 \u00d7 8) \u00d7 12 to find the total number of markers needed for her students\u2019 art project. Which expression is equal to the one used by Ms. Fisher?\nChoices:\nA. 6 + (8 + 12)\nB. 6 + (8 \u00d7 12)\nC. 6 \u00d7 (8 + 12)\nD. 6 \u00d7 (8 \u00d7 12)\nAnswer:", " 6 + (8 \u00d7 12)"], ["Question: Ms. Fisher used the expression (6 \u00d7 8) \u00d7 12 to find the total number of markers needed for her students\u2019 art project. Which expression is equal to the one used by Ms. Fisher?\nChoices:\nA. 6 + (8 + 12)\nB. 6 + (8 \u00d7 12)\nC. 6 \u00d7 (8 + 12)\nD. 6 \u00d7 (8 \u00d7 12)\nAnswer:", " 6 \u00d7 (8 + 12)"], ["Question: Ms. Fisher used the expression (6 \u00d7 8) \u00d7 12 to find the total number of markers needed for her students\u2019 art project. Which expression is equal to the one used by Ms. Fisher?\nChoices:\nA. 6 + (8 + 12)\nB. 6 + (8 \u00d7 12)\nC. 6 \u00d7 (8 + 12)\nD. 6 \u00d7 (8 \u00d7 12)\nAnswer:", " 6 \u00d7 (8 \u00d7 12)"], ["Question: Estimate 999 - 103. The difference is between which numbers?\nChoices:\nA. 1,300 and 1,500\nB. 1,000 and 1,200\nC. 700 and 900\nD. 400 and 600\nAnswer:", " 1,300 and 1,500"], ["Question: Estimate 999 - 103. The difference is between which numbers?\nChoices:\nA. 1,300 and 1,500\nB. 1,000 and 1,200\nC. 700 and 900\nD. 400 and 600\nAnswer:", " 1,000 and 1,200"], ["Question: Estimate 999 - 103. The difference is between which numbers?\nChoices:\nA. 1,300 and 1,500\nB. 1,000 and 1,200\nC. 700 and 900\nD. 400 and 600\nAnswer:", " 700 and 900"], ["Question: Estimate 999 - 103. The difference is between which numbers?\nChoices:\nA. 1,300 and 1,500\nB. 1,000 and 1,200\nC. 700 and 900\nD. 400 and 600\nAnswer:", " 400 and 600"], ["Question: There are 7 desks arranged in a row in Mr. Thompson\u2019s classroom. Hector sits 2 seats to the right of Kim. 
Tonya sits 3 seats to the right of Hector. How many seats to the left of Tonya does Kim sit?\nChoices:\nA. 2\nB. 3\nC. 5\nD. 12\nAnswer:", " 2"], ["Question: There are 7 desks arranged in a row in Mr. Thompson\u2019s classroom. Hector sits 2 seats to the right of Kim. Tonya sits 3 seats to the right of Hector. How many seats to the left of Tonya does Kim sit?\nChoices:\nA. 2\nB. 3\nC. 5\nD. 12\nAnswer:", " 3"], ["Question: There are 7 desks arranged in a row in Mr. Thompson\u2019s classroom. Hector sits 2 seats to the right of Kim. Tonya sits 3 seats to the right of Hector. How many seats to the left of Tonya does Kim sit?\nChoices:\nA. 2\nB. 3\nC. 5\nD. 12\nAnswer:", " 5"], ["Question: There are 7 desks arranged in a row in Mr. Thompson\u2019s classroom. Hector sits 2 seats to the right of Kim. Tonya sits 3 seats to the right of Hector. How many seats to the left of Tonya does Kim sit?\nChoices:\nA. 2\nB. 3\nC. 5\nD. 12\nAnswer:", " 12"]]
\ No newline at end of file
{"results": {"hendrycksTest-elementary_mathematics": {"acc": 0.5, "acc_stderr": 0.16666666666666666, "acc_norm": 0.5, "acc_norm_stderr": 0.16666666666666666}}, "versions": {"hendrycksTest-elementary_mathematics": 0}}
\ No newline at end of file
{"results": {"hendrycksTest-elementary_mathematics": {"acc": 0.2724867724867725, "acc_norm": 0.2830687830687831, "acc_norm_stderr": 0.023201392938194978, "acc_stderr": 0.022930973071633345}}, "versions": {"hendrycksTest-elementary_mathematics": 0}}
\ No newline at end of file
c0d0f0c008a5f3faf2f6f4268d87bbc09c40bb66ae08cf38eea0bf2e519c5a59
\ No newline at end of file
[["Question: Which of the given formulas of PL is the best symbolization of the following sentence?\nEverybody loves Raymond, or not.\nChoices:\nA. L\nB. ~L\nC. L \u2022 ~L\nD. L \u2228 ~L\nAnswer:", " L"], ["Question: Which of the given formulas of PL is the best symbolization of the following sentence?\nEverybody loves Raymond, or not.\nChoices:\nA. L\nB. ~L\nC. L \u2022 ~L\nD. L \u2228 ~L\nAnswer:", " ~L"], ["Question: Which of the given formulas of PL is the best symbolization of the following sentence?\nEverybody loves Raymond, or not.\nChoices:\nA. L\nB. ~L\nC. L \u2022 ~L\nD. L \u2228 ~L\nAnswer:", " L \u2022 ~L"], ["Question: Which of the given formulas of PL is the best symbolization of the following sentence?\nEverybody loves Raymond, or not.\nChoices:\nA. L\nB. ~L\nC. L \u2022 ~L\nD. L \u2228 ~L\nAnswer:", " L \u2228 ~L"], ["Question: Construct a complete truth table for the following argument. Then, using the truth table, determine whether the argument is valid or invalid. If the argument is invalid, choose an option which presents a counterexample. (There may be other counterexamples as well.)\nQ \u2261 R\n~(S \u2228 Q) / R\nChoices:\nA. Valid\nB. Invalid. Counterexample when Q and S are true and R is false\nC. Invalid. Counterexample when Q is true and S and R are false\nD. Invalid. Counterexample when Q, S, and R are false\nAnswer:", " Valid"], ["Question: Construct a complete truth table for the following argument. Then, using the truth table, determine whether the argument is valid or invalid. If the argument is invalid, choose an option which presents a counterexample. (There may be other counterexamples as well.)\nQ \u2261 R\n~(S \u2228 Q) / R\nChoices:\nA. Valid\nB. Invalid. Counterexample when Q and S are true and R is false\nC. Invalid. Counterexample when Q is true and S and R are false\nD. Invalid. Counterexample when Q, S, and R are false\nAnswer:", " Invalid. 
Counterexample when Q and S are true and R is false"], ["Question: Construct a complete truth table for the following argument. Then, using the truth table, determine whether the argument is valid or invalid. If the argument is invalid, choose an option which presents a counterexample. (There may be other counterexamples as well.)\nQ \u2261 R\n~(S \u2228 Q) / R\nChoices:\nA. Valid\nB. Invalid. Counterexample when Q and S are true and R is false\nC. Invalid. Counterexample when Q is true and S and R are false\nD. Invalid. Counterexample when Q, S, and R are false\nAnswer:", " Invalid. Counterexample when Q is true and S and R are false"], ["Question: Construct a complete truth table for the following argument. Then, using the truth table, determine whether the argument is valid or invalid. If the argument is invalid, choose an option which presents a counterexample. (There may be other counterexamples as well.)\nQ \u2261 R\n~(S \u2228 Q) / R\nChoices:\nA. Valid\nB. Invalid. Counterexample when Q and S are true and R is false\nC. Invalid. Counterexample when Q is true and S and R are false\nD. Invalid. Counterexample when Q, S, and R are false\nAnswer:", " Invalid. Counterexample when Q, S, and R are false"], ["Question: Which of the given formulas of PL is the best symbolization of the following sentence?\nEither England's importing beef is a necessary condition for France's subsidizing agriculture or China's promoting human rights is not a sufficient condition for South Africa's supplying diamonds.\nChoices:\nA. (E \u2261 F) \u2228 ~(C \u2261 S)\nB. (E \u2261 F) \u2228 (~C \u2283 S)\nC. (E \u2283 F) \u2228 ~(C \u2283 S)\nD. 
(F \u2283 E) \u2228 ~(C \u2283 S)\nAnswer:", " (E \u2261 F) \u2228 ~(C \u2261 S)"], ["Question: Which of the given formulas of PL is the best symbolization of the following sentence?\nEither England's importing beef is a necessary condition for France's subsidizing agriculture or China's promoting human rights is not a sufficient condition for South Africa's supplying diamonds.\nChoices:\nA. (E \u2261 F) \u2228 ~(C \u2261 S)\nB. (E \u2261 F) \u2228 (~C \u2283 S)\nC. (E \u2283 F) \u2228 ~(C \u2283 S)\nD. (F \u2283 E) \u2228 ~(C \u2283 S)\nAnswer:", " (E \u2261 F) \u2228 (~C \u2283 S)"], ["Question: Which of the given formulas of PL is the best symbolization of the following sentence?\nEither England's importing beef is a necessary condition for France's subsidizing agriculture or China's promoting human rights is not a sufficient condition for South Africa's supplying diamonds.\nChoices:\nA. (E \u2261 F) \u2228 ~(C \u2261 S)\nB. (E \u2261 F) \u2228 (~C \u2283 S)\nC. (E \u2283 F) \u2228 ~(C \u2283 S)\nD. (F \u2283 E) \u2228 ~(C \u2283 S)\nAnswer:", " (E \u2283 F) \u2228 ~(C \u2283 S)"], ["Question: Which of the given formulas of PL is the best symbolization of the following sentence?\nEither England's importing beef is a necessary condition for France's subsidizing agriculture or China's promoting human rights is not a sufficient condition for South Africa's supplying diamonds.\nChoices:\nA. (E \u2261 F) \u2228 ~(C \u2261 S)\nB. (E \u2261 F) \u2228 (~C \u2283 S)\nC. (E \u2283 F) \u2228 ~(C \u2283 S)\nD. (F \u2283 E) \u2228 ~(C \u2283 S)\nAnswer:", " (F \u2283 E) \u2228 ~(C \u2283 S)"], ["Question: Select the best translation into predicate logic: Ms. Jackson lent me some sugar, but neither Janet nor Latoya did.\nChoices:\nA. Lm \u2228 ~(Jl \u2228 Ll)\nB. Lm \u2022 ~(Lj \u2022 Ll)\nC. Lm \u2022 ~(Lj \u2228 Ll)\nD. Lm \u2228 ~(Lj \u2022 Ll)\nAnswer:", " Lm \u2228 ~(Jl \u2228 Ll)"], ["Question: Select the best translation into predicate logic: Ms. 
Jackson lent me some sugar, but neither Janet nor Latoya did.\nChoices:\nA. Lm \u2228 ~(Jl \u2228 Ll)\nB. Lm \u2022 ~(Lj \u2022 Ll)\nC. Lm \u2022 ~(Lj \u2228 Ll)\nD. Lm \u2228 ~(Lj \u2022 Ll)\nAnswer:", " Lm \u2022 ~(Lj \u2022 Ll)"], ["Question: Select the best translation into predicate logic: Ms. Jackson lent me some sugar, but neither Janet nor Latoya did.\nChoices:\nA. Lm \u2228 ~(Jl \u2228 Ll)\nB. Lm \u2022 ~(Lj \u2022 Ll)\nC. Lm \u2022 ~(Lj \u2228 Ll)\nD. Lm \u2228 ~(Lj \u2022 Ll)\nAnswer:", " Lm \u2022 ~(Lj \u2228 Ll)"], ["Question: Select the best translation into predicate logic: Ms. Jackson lent me some sugar, but neither Janet nor Latoya did.\nChoices:\nA. Lm \u2228 ~(Jl \u2228 Ll)\nB. Lm \u2022 ~(Lj \u2022 Ll)\nC. Lm \u2022 ~(Lj \u2228 Ll)\nD. Lm \u2228 ~(Lj \u2022 Ll)\nAnswer:", " Lm \u2228 ~(Lj \u2022 Ll)"], ["Question: Select the best English interpretation of the given arguments in predicate logic.\nDh \u2283 ~Pt\n(\u2200x)Px \u2228 (\u2200x)Mx\n~Mb\t/ ~Dh\nChoices:\nA. If my headache is dualist state, then your tickle is a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nB. If my headache is dualist state, then your tickle is not a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nC. If my headache is dualist state, then your tickle is not a physical state. If everything is physical then everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nD. If my headache is dualist state, then your tickle is not a physical state. Everything is either physical or mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nAnswer:", " If my headache is dualist state, then your tickle is a physical state. Either everything is physical or everything is mental. 
But my broken toe is not a mental state. So my headache is not a dualist state."], ["Question: Select the best English interpretation of the given arguments in predicate logic.\nDh \u2283 ~Pt\n(\u2200x)Px \u2228 (\u2200x)Mx\n~Mb\t/ ~Dh\nChoices:\nA. If my headache is dualist state, then your tickle is a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nB. If my headache is dualist state, then your tickle is not a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nC. If my headache is dualist state, then your tickle is not a physical state. If everything is physical then everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nD. If my headache is dualist state, then your tickle is not a physical state. Everything is either physical or mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nAnswer:", " If my headache is dualist state, then your tickle is not a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state."], ["Question: Select the best English interpretation of the given arguments in predicate logic.\nDh \u2283 ~Pt\n(\u2200x)Px \u2228 (\u2200x)Mx\n~Mb\t/ ~Dh\nChoices:\nA. If my headache is dualist state, then your tickle is a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nB. If my headache is dualist state, then your tickle is not a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nC. If my headache is dualist state, then your tickle is not a physical state. 
If everything is physical then everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nD. If my headache is dualist state, then your tickle is not a physical state. Everything is either physical or mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nAnswer:", " If my headache is dualist state, then your tickle is not a physical state. If everything is physical then everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state."], ["Question: Select the best English interpretation of the given arguments in predicate logic.\nDh \u2283 ~Pt\n(\u2200x)Px \u2228 (\u2200x)Mx\n~Mb\t/ ~Dh\nChoices:\nA. If my headache is dualist state, then your tickle is a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nB. If my headache is dualist state, then your tickle is not a physical state. Either everything is physical or everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nC. If my headache is dualist state, then your tickle is not a physical state. If everything is physical then everything is mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nD. If my headache is dualist state, then your tickle is not a physical state. Everything is either physical or mental. But my broken toe is not a mental state. So my headache is not a dualist state.\nAnswer:", " If my headache is dualist state, then your tickle is not a physical state. Everything is either physical or mental. But my broken toe is not a mental state. So my headache is not a dualist state."], ["Question: Identify the antecedent of the following conditional proposition: The university raises tuition if, and only if, both the governor approves of it and the board of trustees recommends it.\nChoices:\nA. 
The university raises tuition.\nB. The governor approves of it.\nC. The board of trustees recommends it.\nD. None of the above\nAnswer:", " The university raises tuition."], ["Question: Identify the antecedent of the following conditional proposition: The university raises tuition if, and only if, both the governor approves of it and the board of trustees recommends it.\nChoices:\nA. The university raises tuition.\nB. The governor approves of it.\nC. The board of trustees recommends it.\nD. None of the above\nAnswer:", " The governor approves of it."], ["Question: Identify the antecedent of the following conditional proposition: The university raises tuition if, and only if, both the governor approves of it and the board of trustees recommends it.\nChoices:\nA. The university raises tuition.\nB. The governor approves of it.\nC. The board of trustees recommends it.\nD. None of the above\nAnswer:", " The board of trustees recommends it."], ["Question: Identify the antecedent of the following conditional proposition: The university raises tuition if, and only if, both the governor approves of it and the board of trustees recommends it.\nChoices:\nA. The university raises tuition.\nB. The governor approves of it.\nC. The board of trustees recommends it.\nD. None of the above\nAnswer:", " None of the above"], ["Question: Use the following key to translate the given formula of PL to natural, English sentences.\nA: Marina reads a Percy Jackson book.\nB: Izzy plays Minecraft.\nC: Emily stops working.\nD: Russell makes dinner.\nE: Ashleigh stops by.\n~C \u2228 D\nChoices:\nA. If Emily doesn't stop working then Russell makes dinner.\nB. Emily stops working unless Russell makes dinner.\nC. Emily stops working unless Russell doesn't make dinner.\nD. 
Emily doesn't stop working unless Russell makes dinner.\nAnswer:", " If Emily doesn't stop working then Russell makes dinner."], ["Question: Use the following key to translate the given formula of PL to natural, English sentences.\nA: Marina reads a Percy Jackson book.\nB: Izzy plays Minecraft.\nC: Emily stops working.\nD: Russell makes dinner.\nE: Ashleigh stops by.\n~C \u2228 D\nChoices:\nA. If Emily doesn't stop working then Russell makes dinner.\nB. Emily stops working unless Russell makes dinner.\nC. Emily stops working unless Russell doesn't make dinner.\nD. Emily doesn't stop working unless Russell makes dinner.\nAnswer:", " Emily stops working unless Russell makes dinner."], ["Question: Use the following key to translate the given formula of PL to natural, English sentences.\nA: Marina reads a Percy Jackson book.\nB: Izzy plays Minecraft.\nC: Emily stops working.\nD: Russell makes dinner.\nE: Ashleigh stops by.\n~C \u2228 D\nChoices:\nA. If Emily doesn't stop working then Russell makes dinner.\nB. Emily stops working unless Russell makes dinner.\nC. Emily stops working unless Russell doesn't make dinner.\nD. Emily doesn't stop working unless Russell makes dinner.\nAnswer:", " Emily stops working unless Russell doesn't make dinner."], ["Question: Use the following key to translate the given formula of PL to natural, English sentences.\nA: Marina reads a Percy Jackson book.\nB: Izzy plays Minecraft.\nC: Emily stops working.\nD: Russell makes dinner.\nE: Ashleigh stops by.\n~C \u2228 D\nChoices:\nA. If Emily doesn't stop working then Russell makes dinner.\nB. Emily stops working unless Russell makes dinner.\nC. Emily stops working unless Russell doesn't make dinner.\nD. 
Emily doesn't stop working unless Russell makes dinner.\nAnswer:", " Emily doesn't stop working unless Russell makes dinner."], ["Question: Select the best English interpretation of the given arguments in predicate logic.\nWn \u2228 Wm\n(\u2200x)[Lx \u2283 (Dx \u2283 ~Wx)]\nLn \u2022 Dn\t/ ~(\u2200x)~Wx\nChoices:\nA. Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is not at work.\nB. Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is not at work.\nC. Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is at work.\nD. Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is at work.\nAnswer:", " Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is not at work."], ["Question: Select the best English interpretation of the given arguments in predicate logic.\nWn \u2228 Wm\n(\u2200x)[Lx \u2283 (Dx \u2283 ~Wx)]\nLn \u2022 Dn\t/ ~(\u2200x)~Wx\nChoices:\nA. Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is not at work.\nB. Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is not at work.\nC. Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is at work.\nD. Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. 
So not everything is at work.\nAnswer:", " Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is not at work."], ["Question: Select the best English interpretation of the given arguments in predicate logic.\nWn \u2228 Wm\n(\u2200x)[Lx \u2283 (Dx \u2283 ~Wx)]\nLn \u2022 Dn\t/ ~(\u2200x)~Wx\nChoices:\nA. Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is not at work.\nB. Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is not at work.\nC. Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is at work.\nD. Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is at work.\nAnswer:", " Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is at work."], ["Question: Select the best English interpretation of the given arguments in predicate logic.\nWn \u2228 Wm\n(\u2200x)[Lx \u2283 (Dx \u2283 ~Wx)]\nLn \u2022 Dn\t/ ~(\u2200x)~Wx\nChoices:\nA. Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is not at work.\nB. Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is not at work.\nC. Either Nancy or Marvin are at work. All lawyers are out to dinner if they are not at work. Nancy is a lawyer and out to dinner. So not everything is at work.\nD. Either Nancy or Marvin are at work. 
All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is at work.\nAnswer:", " Either Nancy or Marvin are at work. All lawyers are not at work if they are out to dinner. Nancy is a lawyer and out to dinner. So not everything is at work."], ["Question: Identify the conclusion of the following argument. Remember to remove any conclusion indicators. It is wrong for society to kill a murderer. This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer. And a murderer is wrong in killing his victim.\nChoices:\nA. It is wrong for society to kill a murderer.\nB. This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer.\nC. If a murderer is wrong in killing his victim, then society is also wrong in killing the murderer.\nD. And a murderer is wrong in killing his victim.\nAnswer:", " It is wrong for society to kill a murderer."], ["Question: Identify the conclusion of the following argument. Remember to remove any conclusion indicators. It is wrong for society to kill a murderer. This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer. And a murderer is wrong in killing his victim.\nChoices:\nA. It is wrong for society to kill a murderer.\nB. This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer.\nC. If a murderer is wrong in killing his victim, then society is also wrong in killing the murderer.\nD. And a murderer is wrong in killing his victim.\nAnswer:", " This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer."], ["Question: Identify the conclusion of the following argument. Remember to remove any conclusion indicators. 
It is wrong for society to kill a murderer. This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer. And a murderer is wrong in killing his victim.\nChoices:\nA. It is wrong for society to kill a murderer.\nB. This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer.\nC. If a murderer is wrong in killing his victim, then society is also wrong in killing the murderer.\nD. And a murderer is wrong in killing his victim.\nAnswer:", " If a murderer is wrong in killing his victim, then society is also wrong in killing the murderer."], ["Question: Identify the conclusion of the following argument. Remember to remove any conclusion indicators. It is wrong for society to kill a murderer. This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer. And a murderer is wrong in killing his victim.\nChoices:\nA. It is wrong for society to kill a murderer.\nB. This follows for the reason that if a murderer is wrong in killing his victim, then society is also wrong in killing the murderer.\nC. If a murderer is wrong in killing his victim, then society is also wrong in killing the murderer.\nD. And a murderer is wrong in killing his victim.\nAnswer:", " And a murderer is wrong in killing his victim."], ["Question: Select the best translation into predicate logic. All children go to some school. (Cx: x is a child; Sx: x is a school; Gxy: x goes to y)\nChoices:\nA. (\u2200x)(\u2203y)[(Cx \u2022 Sy) \u2022 Gxy)\nB. (\u2200x)[Sx \u2283 (\u2203y)(Cy \u2022 Gxy)]\nC. (\u2200x)[Cx \u2283 (\u2203y)(Sy \u2022 Gxy)]\nD. (\u2200x)[Sx \u2022 (\u2203y)(Cy \u2022 Gxy)]\nAnswer:", " (\u2200x)(\u2203y)[(Cx \u2022 Sy) \u2022 Gxy)"], ["Question: Select the best translation into predicate logic. All children go to some school. 
(Cx: x is a child; Sx: x is a school; Gxy: x goes to y)\nChoices:\nA. (\u2200x)(\u2203y)[(Cx \u2022 Sy) \u2022 Gxy)\nB. (\u2200x)[Sx \u2283 (\u2203y)(Cy \u2022 Gxy)]\nC. (\u2200x)[Cx \u2283 (\u2203y)(Sy \u2022 Gxy)]\nD. (\u2200x)[Sx \u2022 (\u2203y)(Cy \u2022 Gxy)]\nAnswer:", " (\u2200x)[Sx \u2283 (\u2203y)(Cy \u2022 Gxy)]"], ["Question: Select the best translation into predicate logic. All children go to some school. (Cx: x is a child; Sx: x is a school; Gxy: x goes to y)\nChoices:\nA. (\u2200x)(\u2203y)[(Cx \u2022 Sy) \u2022 Gxy)\nB. (\u2200x)[Sx \u2283 (\u2203y)(Cy \u2022 Gxy)]\nC. (\u2200x)[Cx \u2283 (\u2203y)(Sy \u2022 Gxy)]\nD. (\u2200x)[Sx \u2022 (\u2203y)(Cy \u2022 Gxy)]\nAnswer:", " (\u2200x)[Cx \u2283 (\u2203y)(Sy \u2022 Gxy)]"], ["Question: Select the best translation into predicate logic. All children go to some school. (Cx: x is a child; Sx: x is a school; Gxy: x goes to y)\nChoices:\nA. (\u2200x)(\u2203y)[(Cx \u2022 Sy) \u2022 Gxy)\nB. (\u2200x)[Sx \u2283 (\u2203y)(Cy \u2022 Gxy)]\nC. (\u2200x)[Cx \u2283 (\u2203y)(Sy \u2022 Gxy)]\nD. (\u2200x)[Sx \u2022 (\u2203y)(Cy \u2022 Gxy)]\nAnswer:", " (\u2200x)[Sx \u2022 (\u2203y)(Cy \u2022 Gxy)]"]]
\ No newline at end of file
{"results": {"hendrycksTest-formal_logic": {"acc": 0.2, "acc_stderr": 0.13333333333333333, "acc_norm": 0.2, "acc_norm_stderr": 0.13333333333333333}}, "versions": {"hendrycksTest-formal_logic": 0}}
\ No newline at end of file
{"results": {"hendrycksTest-formal_logic": {"acc": 0.25396825396825395, "acc_norm": 0.2698412698412698, "acc_norm_stderr": 0.03970158273235172, "acc_stderr": 0.03893259610604674}}, "versions": {"hendrycksTest-formal_logic": 0}}
\ No newline at end of file
9fdc85240b8170839278b1e883ee0868611d84dce202cb8aa037c841ec76d089
\ No newline at end of file
[["Question: As of 2019, which of the following had the lowest life expectancy?\nChoices:\nA. Brazil\nB. China\nC. Mexico\nD. Iran\nAnswer:", " Brazil"], ["Question: As of 2019, which of the following had the lowest life expectancy?\nChoices:\nA. Brazil\nB. China\nC. Mexico\nD. Iran\nAnswer:", " China"], ["Question: As of 2019, which of the following had the lowest life expectancy?\nChoices:\nA. Brazil\nB. China\nC. Mexico\nD. Iran\nAnswer:", " Mexico"], ["Question: As of 2019, which of the following had the lowest life expectancy?\nChoices:\nA. Brazil\nB. China\nC. Mexico\nD. Iran\nAnswer:", " Iran"], ["Question: About what percentage of survey respondents from China report having paid a bribe in the last year to access public services (such as education; judiciary; medical and health; police; registry and permit services; utilities; tax revenue and customs; and land service) as of 2017?\nChoices:\nA. 26%\nB. 46%\nC. 66%\nD. 86%\nAnswer:", " 26%"], ["Question: About what percentage of survey respondents from China report having paid a bribe in the last year to access public services (such as education; judiciary; medical and health; police; registry and permit services; utilities; tax revenue and customs; and land service) as of 2017?\nChoices:\nA. 26%\nB. 46%\nC. 66%\nD. 86%\nAnswer:", " 46%"], ["Question: About what percentage of survey respondents from China report having paid a bribe in the last year to access public services (such as education; judiciary; medical and health; police; registry and permit services; utilities; tax revenue and customs; and land service) as of 2017?\nChoices:\nA. 26%\nB. 46%\nC. 66%\nD. 86%\nAnswer:", " 66%"], ["Question: About what percentage of survey respondents from China report having paid a bribe in the last year to access public services (such as education; judiciary; medical and health; police; registry and permit services; utilities; tax revenue and customs; and land service) as of 2017?\nChoices:\nA. 26%\nB. 46%\nC. 
66%\nD. 86%\nAnswer:", " 86%"], ["Question: At the time of independence, there were already hundreds of thousands of university graduates in ____, but hardly any at all in ____.\nChoices:\nA. India, Congo\nB. India, South Korea\nC. Congo, South Korea\nD. South Korea, India\nAnswer:", " India, Congo"], ["Question: At the time of independence, there were already hundreds of thousands of university graduates in ____, but hardly any at all in ____.\nChoices:\nA. India, Congo\nB. India, South Korea\nC. Congo, South Korea\nD. South Korea, India\nAnswer:", " India, South Korea"], ["Question: At the time of independence, there were already hundreds of thousands of university graduates in ____, but hardly any at all in ____.\nChoices:\nA. India, Congo\nB. India, South Korea\nC. Congo, South Korea\nD. South Korea, India\nAnswer:", " Congo, South Korea"], ["Question: At the time of independence, there were already hundreds of thousands of university graduates in ____, but hardly any at all in ____.\nChoices:\nA. India, Congo\nB. India, South Korea\nC. Congo, South Korea\nD. South Korea, India\nAnswer:", " South Korea, India"], ["Question: As of 2017, what fraction of the population in India used the internet in the past three months?\nChoices:\nA. 11%\nB. 26%\nC. 41%\nD. 56%\nAnswer:", " 11%"], ["Question: As of 2017, what fraction of the population in India used the internet in the past three months?\nChoices:\nA. 11%\nB. 26%\nC. 41%\nD. 56%\nAnswer:", " 26%"], ["Question: As of 2017, what fraction of the population in India used the internet in the past three months?\nChoices:\nA. 11%\nB. 26%\nC. 41%\nD. 56%\nAnswer:", " 41%"], ["Question: As of 2017, what fraction of the population in India used the internet in the past three months?\nChoices:\nA. 11%\nB. 26%\nC. 41%\nD. 56%\nAnswer:", " 56%"], ["Question: In 2017, about how many people died from terrorism globally?\nChoices:\nA. 260\nB. 2,600\nC. 26,000\nD. 
260,000\nAnswer:", " 260"], ["Question: In 2017, about how many people died from terrorism globally?\nChoices:\nA. 260\nB. 2,600\nC. 26,000\nD. 260,000\nAnswer:", " 2,600"], ["Question: In 2017, about how many people died from terrorism globally?\nChoices:\nA. 260\nB. 2,600\nC. 26,000\nD. 260,000\nAnswer:", " 26,000"], ["Question: In 2017, about how many people died from terrorism globally?\nChoices:\nA. 260\nB. 2,600\nC. 26,000\nD. 260,000\nAnswer:", " 260,000"], ["Question: As of 2017, the share of GDP spent on the military by China is about\nChoices:\nA. 0.50%\nB. 2%\nC. 6%\nD. 12%\nAnswer:", " 0.50%"], ["Question: As of 2017, the share of GDP spent on the military by China is about\nChoices:\nA. 0.50%\nB. 2%\nC. 6%\nD. 12%\nAnswer:", " 2%"], ["Question: As of 2017, the share of GDP spent on the military by China is about\nChoices:\nA. 0.50%\nB. 2%\nC. 6%\nD. 12%\nAnswer:", " 6%"], ["Question: As of 2017, the share of GDP spent on the military by China is about\nChoices:\nA. 0.50%\nB. 2%\nC. 6%\nD. 12%\nAnswer:", " 12%"], ["Question: What was GDP per capita in the United States in 1850 when adjusting for inflation and PPP in 2011 prices?\nChoices:\nA. About $300\nB. About $3k\nC. About $8k\nD. About $15k\nAnswer:", " About $300"], ["Question: What was GDP per capita in the United States in 1850 when adjusting for inflation and PPP in 2011 prices?\nChoices:\nA. About $300\nB. About $3k\nC. About $8k\nD. About $15k\nAnswer:", " About $3k"], ["Question: What was GDP per capita in the United States in 1850 when adjusting for inflation and PPP in 2011 prices?\nChoices:\nA. About $300\nB. About $3k\nC. About $8k\nD. About $15k\nAnswer:", " About $8k"], ["Question: What was GDP per capita in the United States in 1850 when adjusting for inflation and PPP in 2011 prices?\nChoices:\nA. About $300\nB. About $3k\nC. About $8k\nD. 
About $15k\nAnswer:", " About $15k"], ["Question: The percentage of children in Ethiopia (age 8) who reported physical punishment by teachers in the past week in 2009 was about what?\nChoices:\nA. 18%\nB. 38%\nC. 58%\nD. 78%\nAnswer:", " 18%"], ["Question: The percentage of children in Ethiopia (age 8) who reported physical punishment by teachers in the past week in 2009 was about what?\nChoices:\nA. 18%\nB. 38%\nC. 58%\nD. 78%\nAnswer:", " 38%"], ["Question: The percentage of children in Ethiopia (age 8) who reported physical punishment by teachers in the past week in 2009 was about what?\nChoices:\nA. 18%\nB. 38%\nC. 58%\nD. 78%\nAnswer:", " 58%"], ["Question: The percentage of children in Ethiopia (age 8) who reported physical punishment by teachers in the past week in 2009 was about what?\nChoices:\nA. 18%\nB. 38%\nC. 58%\nD. 78%\nAnswer:", " 78%"], ["Question: As of 2016, about what percentage of adults aged 18 years or older were obese?\nChoices:\nA. 6%\nB. 13%\nC. 27%\nD. 46%\nAnswer:", " 6%"], ["Question: As of 2016, about what percentage of adults aged 18 years or older were obese?\nChoices:\nA. 6%\nB. 13%\nC. 27%\nD. 46%\nAnswer:", " 13%"], ["Question: As of 2016, about what percentage of adults aged 18 years or older were obese?\nChoices:\nA. 6%\nB. 13%\nC. 27%\nD. 46%\nAnswer:", " 27%"], ["Question: As of 2016, about what percentage of adults aged 18 years or older were obese?\nChoices:\nA. 6%\nB. 13%\nC. 27%\nD. 46%\nAnswer:", " 46%"], ["Question: Overall, the growth rate of average incomes in less developed countries between 1960 and 1995\nChoices:\nA. was approximately zero\nB. exceeded that of high income countries\nC. exceeded that of Britain during the industrial revolution\nD. was approximately 3.0% per year\nAnswer:", " was approximately zero"], ["Question: Overall, the growth rate of average incomes in less developed countries between 1960 and 1995\nChoices:\nA. was approximately zero\nB. exceeded that of high income countries\nC. 
exceeded that of Britain during the industrial revolution\nD. was approximately 3.0% per year\nAnswer:", " exceeded that of high income countries"], ["Question: Overall, the growth rate of average incomes in less developed countries between 1960 and 1995\nChoices:\nA. was approximately zero\nB. exceeded that of high income countries\nC. exceeded that of Britain during the industrial revolution\nD. was approximately 3.0% per year\nAnswer:", " exceeded that of Britain during the industrial revolution"], ["Question: Overall, the growth rate of average incomes in less developed countries between 1960 and 1995\nChoices:\nA. was approximately zero\nB. exceeded that of high income countries\nC. exceeded that of Britain during the industrial revolution\nD. was approximately 3.0% per year\nAnswer:", " was approximately 3.0% per year"]]
\ No newline at end of file
{"results": {"hendrycksTest-global_facts": {"acc": 0.1, "acc_stderr": 0.09999999999999999, "acc_norm": 0.2, "acc_norm_stderr": 0.13333333333333333}}, "versions": {"hendrycksTest-global_facts": 0}}
\ No newline at end of file
{"results": {"hendrycksTest-global_facts": {"acc": 0.23, "acc_norm": 0.23, "acc_norm_stderr": 0.04229525846816507, "acc_stderr": 0.04229525846816507}}, "versions": {"hendrycksTest-global_facts": 0}}
\ No newline at end of file
d4dc051f37a49dc75c218741e87bc826fd44f31ee1309b55e0f33bd191c1bc78
\ No newline at end of file