gaoqiong / lm-evaluation-harness
Commit 109ed1c7
authored Oct 17, 2023 by lintangsutawika
added subgroups for other mmlu variants
parent 93a45962
Changes: 238
Showing 20 changed files with 830 additions and 1078 deletions
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_formal_logic.yaml  +51 -55
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_global_facts.yaml  +33 -32
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_biology.yaml  +47 -47
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_chemistry.yaml  +5 -4
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_computer_science.yaml  +8 -7
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_european_history.yaml  +23 -22
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_geography.yaml  +37 -63
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_government_and_politics.yaml  +43 -67
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_macroeconomics.yaml  +37 -64
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_mathematics.yaml  +6 -5
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_microeconomics.yaml  +40 -63
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_physics.yaml  +33 -32
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_psychology.yaml  +48 -72
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_statistics.yaml  +58 -88
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_us_history.yaml  +126 -125
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_world_history.yaml  +74 -74
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_human_aging.yaml  +28 -48
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_human_sexuality.yaml  +34 -61
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_international_law.yaml  +54 -80
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_jurisprudence.yaml  +45 -69
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_formal_logic.yaml
-dataset_name: formal_logic
-description: "The following are multiple choice questions (with answers) about formal\
+"dataset_name": "formal_logic"
+"description": "The following are multiple choice questions (with answers) about formal\
   \ logic.\n\nQ: Which of the given formulas of PL is the best symbolization of the\
   \ following sentence?\nTurtles live long lives and are happy creatures, unless they\
-  \ are injured.\n(A) (L \u2022 H) \u2261 I (B) (L \u2022 H) \u2228 I (C) L \u2022\
-  \ (H \u2228 I) (D) L \u2022 (H \u2283 R).\nA: Let's think step by step. We refer\
-  \ to Wikipedia articles on formal logic for help. Let\u2019s solve this step by\
-  \ step. Let \u201CL\u201D denote \u201Cliving long\u201D, H \u201Cbeing happy\u201D\
-  , and \u201CI\u201D \u201Cbeing injured\u201D. Now, consider each choice:\n(A) means\
-  \ (living long AND being happy) is equivalent to (being injured).\n(B) means (living\
-  \ long AND being happy) OR (being injured).\n(C) means (living long) AND (being\
-  \ happy OR being injured).\n(D) means (living long) AND (being happy implies being\
-  \ R), but what R denotes is not clear.\nObviously, (B) is the best symbolization\
-  \ of the original sentence. The answer is (B).\n\nQ: Select the best translation\
-  \ into predicate logic.George borrows Hector's lawnmower. (g: George; h: Hector;\
-  \ l: Hector's lawnmower; Bxyx: x borrows y from z).\n(A) Blgh (B) Bhlg (C) Bglh\
-  \ (D) Bghl\nA: Let's think step by step. We refer to Wikipedia articles on formal\
-  \ logic for help. Let\u2019s solve this step by step. We are told that \u201CBxyx\u201D\
-  \ means \u201Cx borrows y from z\u201D. We can rewrite \u201CGeorge borrows Hector's\
-  \ lawnmower\u201D as \u201CGeorge borrows a lawnmower from Hector\u201D, which can\
-  \ then be translated into predicate logic as \u201CBglh\u201D. The answer \u201C\
-  Bglh\u201D appears in (C); therefore, (C) must be the correct answer. The answer\
-  \ is (C).\n\nQ:\nSelect the best English interpretation of the given arguments\
-  \ in predicate logic.\nDm\n(\u2200x)(Wx \u2283 ~Dx).\n(\u2200x)Wx \u2228 Ag\t/\
-  \ (\u2203x)Ax\n(A) Marina is a dancer. Some weaklings are not dancers. Either everything\
-  \ is a weakling or Georgia plays volleyball. So something plays volleyball. (B)\
-  \ Marina is a dancer. No weakling is a dancer. Everything is either a weakling or\
-  \ plays volleyball. So something plays volleyball. (C) Marina is a dancer. Some\
-  \ weaklings are not dancers. Everything is either a weakling or plays volleyball.\
-  \ So something plays volleyball. (D) Marina is a dancer. No weakling is a dancer.\
-  \ Either everything is a weakling or Georgia plays volleyball. So something plays\
-  \ volleyball.\nA: Let's think step by step. We refer to Wikipedia articles on formal\
-  \ logic for help. Let\u2019s solve this step by step. Let \u201CD\u201D denote \u201C\
-  being a dancer\u201D, \u201Cm\u201D denote \u201CMaria\u201D, \u201Cg\u201D denote\
-  \ \u201CGeorgia\u201D, \u201CW\u201D denote \u201Cweakling\u201D, \u201CA\u201D\
-  \ denote \u201Cplaying volleyball\u201D. Then, we have the following:\n1. Dm \u2192\
-  \ Maria is a dance.\n2. (\u2200x)(Wx \u2283 ~Dx). \u2192 For all x, if x is a weakling,\
-  \ then x is not a dancer. In other words, no weakling is a dancer.\n3. (\u2200x)Wx\
-  \ \u2228 Ag\t/ (\u2203x)Ax \u2192 For all x, x is a weakling or Georgia plays volleyball.\
-  \ So there exists an x that plays volleyball.\nOptions (A) and (C) do claim that\
-  \ some weaklings are not dancers, but the second argument strongly states that no\
-  \ weakling is a dancer. Thus, we can eliminate them. Option (B) omits the important\
-  \ detail about Georgia playing volleyball. Option (D) has all the details presented\
-  \ in the arguments and is the best English interpretation of the arguments. The\
-  \ answer is (D).\n\nQ: Select the best translation into predicate logic: No people\
-  \ drive on Mars.\n(A) ~Pd (B) (\u2200x)(Px \u2228 ~Dx) (C) (\u2200x)(Px \u2283 ~Dx)\
-  \ (D) ~Dp\nA: Let's think step by step. We refer to Wikipedia articles on formal\
-  \ logic for help. Let\u2019s solve this step by step. Let \u201CP\u201D denote \u201C\
-  being on Mars\u201D and \u201CD\u201D denote \u201Cdriving on Mars\u201D. Then let\u2019\
-  s consider each option:\nOption (A): ~Pd \u2192 d is not on Mars.\nOption (B): (\u2200\
-  x)(Px \u2228 ~Dx) \u2192 For all x, x is on Mars and x do not drive on Mars.\nOption\
-  \ (C): (\u2200x)(Px \u2283 ~Dx) \u2192 For all x, x is on Mars implies that x do\
-  \ not drive on Mars.\nOption (D): ~Dp: \u2192 p do not drive on Mars.\nOf all these\
-  \ options, Option (C) appears to be the best and most meaningful interpretation\
-  \ of the argument \u201CNo people drive on Mars.\u201D The answer is (C)."
-include: _mmlu_flan_cot_fewshot_template_yaml
-task: mmlu_flan_cot_fewshot_formal_logic
+  \ are injured.\n(A) (L • H) ≡ I (B) (L • H) ∨ I (C) L • (H ∨ I) (D) L • (H ⊃ R).\n\
+  A: Let's think step by step. We refer to Wikipedia articles on formal logic for\
+  \ help. Let’s solve this step by step. Let “L” denote “living long”, H “being happy”,\
+  \ and “I” “being injured”. Now, consider each choice:\n(A) means (living long AND\
+  \ being happy) is equivalent to (being injured).\n(B) means (living long AND being\
+  \ happy) OR (being injured).\n(C) means (living long) AND (being happy OR being\
+  \ injured).\n(D) means (living long) AND (being happy implies being R), but what\
+  \ R denotes is not clear.\nObviously, (B) is the best symbolization of the original\
+  \ sentence. The answer is (B).\n\nQ: Select the best translation into predicate\
+  \ logic.George borrows Hector's lawnmower. (g: George; h: Hector; l: Hector's lawnmower;\
+  \ Bxyx: x borrows y from z).\n(A) Blgh (B) Bhlg (C) Bglh (D) Bghl\nA: Let's think\
+  \ step by step. We refer to Wikipedia articles on formal logic for help. Let’s solve\
+  \ this step by step. We are told that “Bxyx” means “x borrows y from z”. We can\
+  \ rewrite “George borrows Hector's lawnmower” as “George borrows a lawnmower from\
+  \ Hector”, which can then be translated into predicate logic as “Bglh”. The answer\
+  \ “Bglh” appears in (C); therefore, (C) must be the correct answer. The answer is\
+  \ (C).\n\nQ:\nSelect the best English interpretation of the given arguments in\
+  \ predicate logic.\nDm\n(∀x)(Wx ⊃ ~Dx).\n(∀x)Wx ∨ Ag\t/ (∃x)Ax\n(A) Marina is a\
+  \ dancer. Some weaklings are not dancers. Either everything is a weakling or Georgia\
+  \ plays volleyball. So something plays volleyball. (B) Marina is a dancer. No weakling\
+  \ is a dancer. Everything is either a weakling or plays volleyball. So something\
+  \ plays volleyball. (C) Marina is a dancer. Some weaklings are not dancers. Everything\
+  \ is either a weakling or plays volleyball. So something plays volleyball. (D) Marina\
+  \ is a dancer. No weakling is a dancer. Either everything is a weakling or Georgia\
+  \ plays volleyball. So something plays volleyball.\nA: Let's think step by step.\
+  \ We refer to Wikipedia articles on formal logic for help. Let’s solve this step\
+  \ by step. Let “D” denote “being a dancer”, “m” denote “Maria”, “g” denote “Georgia”,\
+  \ “W” denote “weakling”, “A” denote “playing volleyball”. Then, we have the following:\n\
+  1. Dm → Maria is a dance.\n2. (∀x)(Wx ⊃ ~Dx). → For all x, if x is a weakling, then\
+  \ x is not a dancer. In other words, no weakling is a dancer.\n3. (∀x)Wx ∨ Ag\t\
+  / (∃x)Ax → For all x, x is a weakling or Georgia plays volleyball. So there exists\
+  \ an x that plays volleyball.\nOptions (A) and (C) do claim that some weaklings\
+  \ are not dancers, but the second argument strongly states that no weakling is a\
+  \ dancer. Thus, we can eliminate them. Option (B) omits the important detail about\
+  \ Georgia playing volleyball. Option (D) has all the details presented in the arguments\
+  \ and is the best English interpretation of the arguments. The answer is (D).\n\n\
+  Q: Select the best translation into predicate logic: No people drive on Mars.\n\
+  (A) ~Pd (B) (∀x)(Px ∨ ~Dx) (C) (∀x)(Px ⊃ ~Dx) (D) ~Dp\nA: Let's think step by step.\
+  \ We refer to Wikipedia articles on formal logic for help. Let’s solve this step\
+  \ by step. Let “P” denote “being on Mars” and “D” denote “driving on Mars”. Then\
+  \ let’s consider each option:\nOption (A): ~Pd → d is not on Mars.\nOption (B):\
+  \ (∀x)(Px ∨ ~Dx) → For all x, x is on Mars and x do not drive on Mars.\nOption (C):\
+  \ (∀x)(Px ⊃ ~Dx) → For all x, x is on Mars implies that x do not drive on Mars.\n\
+  Option (D): ~Dp: → p do not drive on Mars.\nOf all these options, Option (C) appears\
+  \ to be the best and most meaningful interpretation of the argument “No people drive\
+  \ on Mars.” The answer is (C)."
+"group": "mmlu_flan_cot_fewshot_humanities"
+"include": "_mmlu_flan_cot_fewshot_template_yaml"
+"task": "mmlu_flan_cot_fewshot_formal_logic"
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_global_facts.yaml
-dataset_name: global_facts
-description: "The following are multiple choice questions (with answers) about global\
-  \ facts.\n\nQ: As of 2017, how many of the world\u2019s 1-year-old children today\
-  \ have been vaccinated against some disease? *\n(A) 80% (B) 60% (C) 40% (D) 20%\n\
-  A: Let's think step by step. We refer to Wikipedia articles on global facts for\
-  \ help. According to data published by the World Health Organization, the nummber\
-  \ of 1-year-old children vaccinated in 2017 exceeds 80%. The answer is (A).\n\n\
-  Q: As of 2019, about what percentage of Americans agree that the state is run for\
-  \ the benefit of all the people?\n(A) 31% (B) 46% (C) 61% (D) 76%\nA: Let's think\
-  \ step by step. We refer to Wikipedia articles on global facts for help. In 2019,\
-  \ about 46% percentage of Americans agree that the state is run for the benefit\
-  \ of all the people. The answer is (B).\n\nQ: As of 2019, about what percentage\
-  \ of Russians say it is very important to have free media in our country without\
-  \ government/state censorship?\n(A) 38% (B) 53% (C) 68% (D) 83%\nA: Let's think\
-  \ step by step. We refer to Wikipedia articles on global facts for help. As of 2019,\
-  \ about 38% of Russians say it is very important to have free media in our country.\
-  \ The answer is (A).\n\nQ: As of 2015, since 1990 forests have ____ in Europe and\
-  \ have ____ in Africa and the Americas.\n(A) increased, increased (B) increased,\
-  \ decreased (C) decreased, increased (D) decreased, decreased\nA: Let's think step\
-  \ by step. We refer to Wikipedia articles on global facts for help. As of 2015,\
-  \ since 1990 forests have increased in Europe and have decreased in Africa and the\
-  \ Americas. The answer is (B).\n\nQ: Which of the following pairs of statements\
-  \ are both true (as of 2019)?\n(A) People tend to be optimistic about their own\
-  \ future and the future of their nation or the world. (B) People tend to be optimistic\
+"dataset_name": "global_facts"
+"description": "The following are multiple choice questions (with answers) about global\
+  \ facts.\n\nQ: As of 2017, how many of the world’s 1-year-old children today have\
+  \ been vaccinated against some disease? *\n(A) 80% (B) 60% (C) 40% (D) 20%\nA: Let's\
+  \ think step by step. We refer to Wikipedia articles on global facts for help. According\
+  \ to data published by the World Health Organization, the nummber of 1-year-old\
+  \ children vaccinated in 2017 exceeds 80%. The answer is (A).\n\nQ: As of 2019,\
+  \ about what percentage of Americans agree that the state is run for the benefit\
+  \ of all the people?\n(A) 31% (B) 46% (C) 61% (D) 76%\nA: Let's think step by step.\
+  \ We refer to Wikipedia articles on global facts for help. In 2019, about 46% percentage\
+  \ of Americans agree that the state is run for the benefit of all the people. The\
+  \ answer is (B).\n\nQ: As of 2019, about what percentage of Russians say it is very\
+  \ important to have free media in our country without government/state censorship?\n\
+  (A) 38% (B) 53% (C) 68% (D) 83%\nA: Let's think step by step. We refer to Wikipedia\
+  \ articles on global facts for help. As of 2019, about 38% of Russians say it is\
+  \ very important to have free media in our country. The answer is (A).\n\nQ: As\
+  \ of 2015, since 1990 forests have ____ in Europe and have ____ in Africa and the\
+  \ Americas.\n(A) increased, increased (B) increased, decreased (C) decreased, increased\
+  \ (D) decreased, decreased\nA: Let's think step by step. We refer to Wikipedia articles\
+  \ on global facts for help. As of 2015, since 1990 forests have increased in Europe\
+  \ and have decreased in Africa and the Americas. The answer is (B).\n\nQ: Which\
+  \ of the following pairs of statements are both true (as of 2019)?\n(A) People tend\
+  \ to be optimistic about their own future and the future of their nation or the\
+  \ world. (B) People tend to be optimistic about their own future but pessimistic\
+  \ about the future of their nation or the world. (C) People tend to be pessimistic\
+  \ about their own future but optimistic about the future of their nation or the\
+  \ world. (D) People tend to be pessimistic about their own future and the future\
+  \ of their nation or the world.\nA: Let's think step by step. We refer to Wikipedia\
+  \ articles on global facts for help. As of 2019, most people tend to be optimistic\
   \ about their own future but pessimistic about the future of their nation or the\
-  \ world. (C) People tend to be pessimistic about their own future but optimistic\
-  \ about the future of their nation or the world. (D) People tend to be pessimistic\
-  \ about their own future and the future of their nation or the world.\nA: Let's\
-  \ think step by step. We refer to Wikipedia articles on global facts for help. As\
-  \ of 2019, most people tend to be optimistic about their own future but pessimistic\
-  \ about the future of their nation or the world. The answer is (B)."
-include: _mmlu_flan_cot_fewshot_template_yaml
-task: mmlu_flan_cot_fewshot_global_facts
+  \ world. The answer is (B)."
+"group": "mmlu_flan_cot_fewshot_other"
+"include": "_mmlu_flan_cot_fewshot_template_yaml"
+"task": "mmlu_flan_cot_fewshot_global_facts"
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_biology.yaml
-dataset_name: high_school_biology
-description: "The following are multiple choice questions (with answers) about high\
+"dataset_name": "high_school_biology"
+"description": "The following are multiple choice questions (with answers) about high\
   \ school biology.\n\nQ: In animal cells, which of the following represents the most\
   \ likely pathway that a secretory protein takes as it is synthesized in a cell?\n\
-  (A) Plasma membrane\u2013Golgi apparatus\u2013ribosome\u2013secretory vesicle\u2013\
-  rough ER (B) Ribosome\u2013Golgi apparatus\u2013rough ER\u2013secretory vesicle\u2013\
-  plasma membrane (C) Plasma membrane\u2013Golgi apparatus\u2013ribosome\u2013secretory\
-  \ vesicle\u2013rough ER (D) Ribosome\u2013rough ER\u2013Golgi apparatus\u2013secretory\
-  \ vesicle\u2013plasma membrane\nA: Let's think step by step. Protein synthesis starts\
+  (A) Plasma membrane–Golgi apparatus–ribosome–secretory vesicle–rough ER (B) Ribosome–Golgi\
+  \ apparatus–rough ER–secretory vesicle–plasma membrane (C) Plasma membrane–Golgi\
+  \ apparatus–ribosome–secretory vesicle–rough ER (D) Ribosome–rough ER–Golgi apparatus–secretory\
+  \ vesicle–plasma membrane\nA: Let's think step by step. Protein synthesis starts\
   \ at the ribosome, so we can eliminate (A) and (C). The ribosome is often in the\
   \ endoplasmic reticulum and moves from there to the Golgi apparatus, where it is\
   \ modified and packaged into a vesicle. The vesicle then floats to the plasma membrane\
   \ and is secreted. The answer is (D).\n\nQ: A mutation in a bacterial enzyme changed\
   \ a previously polar amino acid into a nonpolar amino acid. This amino acid was\
-  \ located at a site distant from the enzyme\u2019s active site. How might this mutation\
-  \ alter the enzyme\u2019s substrate specificity?\n(A) By changing the enzyme\u2019\
-  s pH optimum (B) By changing the enzyme\u2019s location in the cell (C) By changing\
-  \ the shape of the protein (D) An amino acid change away from the active site cannot\
-  \ alter the enzyme\u2019s substrate specificity.\nA: Let's think step by step. A\
-  \ change in an amino acid leads to a change in the primary structure of the protein.\
-  \ A change in the primary structure may lead to a change in the secondary and the\
-  \ tertiary structure of the protein. A change in the tertiary structure means a\
-  \ change in the shape of the protein, so (C) has to be correct. Since the change\
-  \ does not affect the active site of the enzyme, we do not expect the activity of\
-  \ the enzyme to be affected. The answer is (C).\n\nQ: Which of the following is\
-  \ not a way to form recombinant DNA?\n(A) Translation (B) Conjugation (C) Specialized\
-  \ transduction (D) Transformation\nA: Let's think step by step. The introduction\
-  \ of foreign DNA or RNA into bacteria or eukaryotic cells is a common technique\
-  \ in molecular biology and scientific research. There are multiple ways foreign\
-  \ DNA can be introduced into cells including transformation, transduction, conjugation,\
-  \ and transfection. In contrast, (A) is not a way to form DNA: during translation\
-  \ the ribosomes synthesize proteins from RNA. The answer is (A).\n\nQ: Homologous\
-  \ structures are often cited as evidence for the process of natural selection. All\
-  \ of the following are examples of homologous structures EXCEPT\n(A) the wings of\
-  \ a bird and the wings of a bat (B) the flippers of a whale and the arms of a man\
-  \ (C) the pectoral fins of a porpoise and the flippers of a seal (D) the forelegs\
-  \ of an insect and the forelimbs of a dog\nA: Let's think step by step. \u200B\u200B\
-  Homologous structures are similar physical features in organisms that share a common\
-  \ ancestor \u200B\u200Bbut different functions. Comparisons (B) and (C) are clearly\
-  \ homologous because they share a common ancestor and the structures serve different\
-  \ purposes. Bat wings and birg wings are also homologous, while they are both wings,\
-  \ the forelimbs serve different purposes. Insects and dogs are very far ancestors\
-  \ since one is vertebrate while the other is invertebrate and the forelimbs serve\
-  \ the same purpose, so they are not homologous. The answer is (D).\n\nQ: Which of\
-  \ the following is not known to be involved in the control of cell division?\n(A)\
-  \ Cyclins (B) Protein kinases (C) Checkpoints (D) Fibroblast cells\nA: Let's think\
-  \ step by step. Normal cells move through the cell cycle in a regulated way. At\
-  \ the checkpoint stage, they use information about their own internal state and\
-  \ cues from the environment around them to decide whether to proceed with cell division.\
-  \ Cues like these act by changing the activity of core cell cycle regulators inside\
-  \ the cell. The most common regulators are cyclins and cyclin-dependent kinases.\
-  \ Fibroblast cells do not play any role in cell division. The answer is (D)."
-include: _mmlu_flan_cot_fewshot_template_yaml
-task: mmlu_flan_cot_fewshot_high_school_biology
+  \ located at a site distant from the enzyme’s active site. How might this mutation\
+  \ alter the enzyme’s substrate specificity?\n(A) By changing the enzyme’s pH optimum\
+  \ (B) By changing the enzyme’s location in the cell (C) By changing the shape of\
+  \ the protein (D) An amino acid change away from the active site cannot alter the\
+  \ enzyme’s substrate specificity.\nA: Let's think step by step. A change in an amino\
+  \ acid leads to a change in the primary structure of the protein. A change in the\
+  \ primary structure may lead to a change in the secondary and the tertiary structure\
+  \ of the protein. A change in the tertiary structure means a change in the shape\
+  \ of the protein, so (C) has to be correct. Since the change does not affect the\
+  \ active site of the enzyme, we do not expect the activity of the enzyme to be affected.\
+  \ The answer is (C).\n\nQ: Which of the following is not a way to form recombinant\
+  \ DNA?\n(A) Translation (B) Conjugation (C) Specialized transduction (D) Transformation\n\
+  A: Let's think step by step. The introduction of foreign DNA or RNA into bacteria\
+  \ or eukaryotic cells is a common technique in molecular biology and scientific\
+  \ research. There are multiple ways foreign DNA can be introduced into cells including\
+  \ transformation, transduction, conjugation, and transfection. In contrast, (A)\
+  \ is not a way to form DNA: during translation the ribosomes synthesize proteins\
+  \ from RNA. The answer is (A).\n\nQ: Homologous structures are often cited as evidence\
+  \ for the process of natural selection. All of the following are examples of homologous\
+  \ structures EXCEPT\n(A) the wings of a bird and the wings of a bat (B) the flippers\
+  \ of a whale and the arms of a man (C) the pectoral fins of a porpoise and the flippers\
+  \ of a seal (D) the forelegs of an insect and the forelimbs of a dog\nA: Let's think\
+  \ step by step. Homologous structures are similar physical features in organisms\
+  \ that share a common ancestor but different functions. Comparisons (B) and (C)\
+  \ are clearly homologous because they share a common ancestor and the structures\
+  \ serve different purposes. Bat wings and birg wings are also homologous, while\
+  \ they are both wings, the forelimbs serve different purposes. Insects and dogs\
+  \ are very far ancestors since one is vertebrate while the other is invertebrate\
+  \ and the forelimbs serve the same purpose, so they are not homologous. The answer\
+  \ is (D).\n\nQ: Which of the following is not known to be involved in the control\
+  \ of cell division?\n(A) Cyclins (B) Protein kinases (C) Checkpoints (D) Fibroblast\
+  \ cells\nA: Let's think step by step. Normal cells move through the cell cycle in\
+  \ a regulated way. At the checkpoint stage, they use information about their own\
+  \ internal state and cues from the environment around them to decide whether to\
+  \ proceed with cell division. Cues like these act by changing the activity of core\
+  \ cell cycle regulators inside the cell. The most common regulators are cyclins\
+  \ and cyclin-dependent kinases. Fibroblast cells do not play any role in cell division.\
+  \ The answer is (D)."
+"group": "mmlu_flan_cot_fewshot_stem"
+"include": "_mmlu_flan_cot_fewshot_template_yaml"
+"task": "mmlu_flan_cot_fewshot_high_school_biology"
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_chemistry.yaml
-dataset_name: high_school_chemistry
-description: "The following are multiple choice questions (with answers) about high\
+"dataset_name": "high_school_chemistry"
+"description": "The following are multiple choice questions (with answers) about high\
   \ school chemistry.\n\nQ: Which of the following is considered an acid anhydride?\n\
   (A) HCl (B) H2SO3 (C) SO2 (D) Al(NO3)3\nA: Let's think step by step. An acid anhydride\
   \ is a compound that is derived by removing water from an acid. The chemical formula\
...
@@ -45,5 +45,6 @@ description: "The following are multiple choice questions (with answers) about h
   \ the acetate ion. The added strong acid, Nitric acid, will react with the conjugate\
   \ base. Therefore the maximum amount of acid that can be added will be equal to\
   \ the amount of acetate ion, or 2 moles. The answer is (C)."
-include: _mmlu_flan_cot_fewshot_template_yaml
-task: mmlu_flan_cot_fewshot_high_school_chemistry
+"group": "mmlu_flan_cot_fewshot_stem"
+"include": "_mmlu_flan_cot_fewshot_template_yaml"
+"task": "mmlu_flan_cot_fewshot_high_school_chemistry"
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_computer_science.yaml
View file @
109ed1c7
dataset_name
:
high_school_computer_science
"
dataset_name
"
:
"
high_school_computer_science
"
description
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
"
description
"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
\
school
computer
science.
\n\n
Q:
Which
of
the
following
is
an
example
of
the
use
\
\
school
computer
science.
\n\n
Q:
Which
of
the
following
is
an
example
of
the
use
\
\
of
a
device
on
the
Internet
of
Things
(IoT)
?
\n
(A)
A
car
alerts
a
driver
that
\
\
of
a
device
on
the
Internet
of
Things
(IoT)
?
\n
(A)
A
car
alerts
a
driver
that
\
\
it
is
about
to
hit
an
object.
(B)
A
hiker
uses
a
G
P
S
watch
to
keep
track
of
\
\
it
is
about
to
hit
an
object.
(B)
A
hiker
uses
a
G
P
S
watch
to
keep
track
of
\
...
@@ -26,9 +26,9 @@ description: "The following are multiple choice questions (with answers) about h
...
@@ -26,9 +26,9 @@ description: "The following are multiple choice questions (with answers) about h
\
launched
from
any
web
sites
visited
or
files
downloaded.
\n
A:
Let's
think
step
\
\
launched
from
any
web
sites
visited
or
files
downloaded.
\n
A:
Let's
think
step
\
\
by
step.
Choice
A
is
incorrect
as
it
only
describes
network
traffic,
which
an
\
\
by
step.
Choice
A
is
incorrect
as
it
only
describes
network
traffic,
which
an
\
\
anonymous
browser
does
not
change.
Choice
B
is
correct
as
it
correctly
describes
\
\
anonymous
browser
does
not
change.
Choice
B
is
correct
as
it
correctly
describes
\
\
how
an
anonymous
browser
will
prevent
saving
data
on
the
user
\u2019
s
computer
\
\
how
an
anonymous
browser
will
prevent
saving
data
on
the
user
’
s
computer
after
\
\
after
the
session
is
ended.
Choice
C
is
incorrect
because
an
anonymous
browser
\
\
the
session
is
ended.
Choice
C
is
incorrect
because
an
anonymous
browser
will
\
\
will
not
prevent
logging
in
to
email
or
social
media
accounts.
Choice
D
is
incorrect
\
\
not
prevent
logging
in
to
email
or
social
media
accounts.
Choice
D
is
incorrect
\
\
because
an
anonymous
browser
in
itself
performs
no
virus
protection.
The
answer
\
\
because
an
anonymous
browser
in
itself
performs
no
virus
protection.
The
answer
\
\
is
(B).
\n\n
Q:
In
the
program
below,
the
initial
value
of
X
is
5
and
the
initial
\
\
is
(B).
\n\n
Q:
In
the
program
below,
the
initial
value
of
X
is
5
and
the
initial
\
\
value
of
Y
is
10.
\n
IF
(X
<
0){
\n
DISPLAY
(
\"
Foxtrot
\"
)
\n
}
ELSE
{
\n
IF
(X
>
Y){
\n\
\
value
of
Y
is
10.
\n
IF
(X
<
0){
\n
DISPLAY
(
\"
Foxtrot
\"
)
\n
}
ELSE
{
\n
IF
(X
>
Y){
\n\
...
@@ -66,5 +66,6 @@ description: "The following are multiple choice questions (with answers) about h
...
@@ -66,5 +66,6 @@ description: "The following are multiple choice questions (with answers) about h
\
its
value
is
greater
than
100,
regardless
of
the
elements
in
the
list.
Choice
\
\
its
value
is
greater
than
100,
regardless
of
the
elements
in
the
list.
Choice
\
\
D
is
incorrect
because
its
step
3
does
not
increment
the
value
of
position,
so
\
\
D
is
incorrect
because
its
step
3
does
not
increment
the
value
of
position,
so
\
\
it
will
repeat
forever.
The
answer
is
(B)."
\
it
will
repeat
forever.
The
answer
is
(B)."
include: _mmlu_flan_cot_fewshot_template_yaml
"group": "mmlu_flan_cot_fewshot_stem"
task: mmlu_flan_cot_fewshot_high_school_computer_science
"include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_high_school_computer_science"
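The tail of this file now carries a `group` key next to `include` and `task`, and the same three-key pattern recurs in the other subject files of this commit. The following is a minimal sketch of how such a tail could be rendered, not the repository's actual generator. The helper `render_task_tail` and the mapping `SUBJECT_GROUPS` are hypothetical names introduced for illustration; only the key names and the values are taken from the diff.

```python
import json

# Hypothetical mapping from MMLU subject to subgroup suffix; the two
# entries mirror values that are visible in this diff.
SUBJECT_GROUPS = {
    "high_school_computer_science": "stem",
    "high_school_european_history": "humanities",
}

PREFIX = "mmlu_flan_cot_fewshot"


def render_task_tail(subject: str) -> str:
    """Return the group/include/task lines for one subject file."""
    fields = {
        "group": f"{PREFIX}_{SUBJECT_GROUPS[subject]}",
        "include": f"_{PREFIX}_template_yaml",
        "task": f"{PREFIX}_{subject}",
    }
    # json.dumps yields double-quoted scalars, which are also valid YAML.
    return "\n".join(
        f"{json.dumps(key)}: {json.dumps(value)}" for key, value in fields.items()
    )


print(render_task_tail("high_school_computer_science"))
```

Rendering every scalar through `json.dumps` keeps the quoting uniform, which matches the double-quoted style of the added lines.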
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_european_history.yaml
dataset_name: high_school_european_history
"dataset_name": "high_school_european_history"
description
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
"
description
"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
\
school
european
history.
\n\n
Q:
This
question
refers
to
the
following
information.
\n\
\
school
european
history.
\n\n
Q:
This
question
refers
to
the
following
information.
\n\
Albeit
the
king's
Majesty
justly
and
rightfully
is
and
ought
to
be
the
supreme
head
\
Albeit
the
king's
Majesty
justly
and
rightfully
is
and
ought
to
be
the
supreme
head
\
\
of
the
Church
of
England,
and
so
is
recognized
by
the
clergy
of
this
realm
in
\
\
of
the
Church
of
England,
and
so
is
recognized
by
the
clergy
of
this
realm
in
\
...
@@ -34,7 +34,7 @@ description: "The following are multiple choice questions (with answers) about h
\
the
corruption
in
the
Church
of
England.
The
answer
is
(D).
\n\n
Q:
This
question
\
\
the
corruption
in
the
Church
of
England.
The
answer
is
(D).
\n\n
Q:
This
question
\
\
refers
to
the
following
information.
\n
Read
the
following
excerpt.
\n
The
revolutionary
\
\
refers
to
the
following
information.
\n
Read
the
following
excerpt.
\n
The
revolutionary
\
\
seed
had
penetrated
into
every
country
and
spread
more
or
less.
It
was
greatly
\
\
seed
had
penetrated
into
every
country
and
spread
more
or
less.
It
was
greatly
\
\
developed
under
the
r
\xE9
gime
of
the
military
despotism
of
Bonaparte.
His
conquests
\
\
developed
under
the
r
é
gime
of
the
military
despotism
of
Bonaparte.
His
conquests
\
\
displaced
a
number
of
laws,
institutions,
and
customs;
broke
through
bonds
sacred
\
\
displaced
a
number
of
laws,
institutions,
and
customs;
broke
through
bonds
sacred
\
\
among
all
nations,
strong
enough
to
resist
time
itself;
which
is
more
than
can
\
\
among
all
nations,
strong
enough
to
resist
time
itself;
which
is
more
than
can
\
\
be
said
of
certain
benefits
conferred
by
these
innovators.
\n
The
monarchs
will
\
\
be
said
of
certain
benefits
conferred
by
these
innovators.
\n
The
monarchs
will
\
...
@@ -55,9 +55,9 @@ description: "The following are multiple choice questions (with answers) about h
Let
them
maintain
religious
principles
in
all
their
purity,
and
not
allow
the
faith
\
Let
them
maintain
religious
principles
in
all
their
purity,
and
not
allow
the
faith
\
\
to
be
attacked
and
morality
interpreted
according
to
the
social
contract
or
the
\
\
to
be
attacked
and
morality
interpreted
according
to
the
social
contract
or
the
\
\
visions
of
foolish
sectarians.
\n
Let
them
suppress
Secret
Societies;
that
gangrene
\
\
visions
of
foolish
sectarians.
\n
Let
them
suppress
Secret
Societies;
that
gangrene
\
\
of
society.
\n
\u2014
Klemens
von
Metternich,
Political
Confession
of
Faith,
1820
\n\
\
of
society.
\n
—
Klemens
von
Metternich,
Political
Confession
of
Faith,
1820
\n
Which
\
Which
of
the
following
was
the
greatest
cause
of
the
fears
expressed
by
Metternich
\
\
of
the
following
was
the
greatest
cause
of
the
fears
expressed
by
Metternich
in
\
\
in
the
document
above?
\n
(A)
The
ideas
of
personal
liberty
and
nationalism
conceived
\
\
the
document
above?
\n
(A)
The
ideas
of
personal
liberty
and
nationalism
conceived
\
\
during
the
Enlightenment
resulted
in
radical
revolutions
that
could
spread
throughout
\
\
during
the
Enlightenment
resulted
in
radical
revolutions
that
could
spread
throughout
\
\
Europe.
(B)
The
conquest
of
Europe
by
Napoleon
led
to
the
creation
of
new
factions
\
\
Europe.
(B)
The
conquest
of
Europe
by
Napoleon
led
to
the
creation
of
new
factions
\
\
and
shifted
the
European
balance
of
power.
(C)
The
power
of
monarchs
had
grown
\
\
and
shifted
the
European
balance
of
power.
(C)
The
power
of
monarchs
had
grown
\
...
@@ -110,15 +110,15 @@ description: "The following are multiple choice questions (with answers) about h
\
were
all
turning
to
the
politicians;
the
famous
Nihilists
who
made
Europe
tremble-sons
\
\
were
all
turning
to
the
politicians;
the
famous
Nihilists
who
made
Europe
tremble-sons
\
\
of
village
priests,
of
the
lower
middle
class,
of
tradesmen-could
not
rise
above
\
\
of
village
priests,
of
the
lower
middle
class,
of
tradesmen-could
not
rise
above
\
\
the
idea
of
national
liberation,
and
seemed
to
believe
that
the
world
would
be
\
\
the
idea
of
national
liberation,
and
seemed
to
believe
that
the
world
would
be
\
\
delivered-when
they
had
killed
their
despot&
\u2026
\n\"
Foolery!
They'll
never
get
\
\
delivered-when
they
had
killed
their
despot&
…
\n\"
Foolery!
They'll
never
get
out
\
\
out
of
it
with
their
foolery.
\"\n
Then,
lowering
his
voice
still
more,
in
a
few
\
\
of
it
with
their
foolery.
\"\n
Then,
lowering
his
voice
still
more,
in
a
few
bitter
\
\
bitter
words
he
described
his
old
dream
of
fraternity.
He
had
renounced
his
rank
\
\
words
he
described
his
old
dream
of
fraternity.
He
had
renounced
his
rank
and
\
\
and
his
fortune;
he
had
gone
among
workmen,
only
in
the
hope
of
seeing
at
last
\
\
his
fortune;
he
had
gone
among
workmen,
only
in
the
hope
of
seeing
at
last
the
\
\
the
foundation
of
a
new
society
of
labour
in
common.
All
the
sous
in
his
pockets
\
\
foundation
of
a
new
society
of
labour
in
common.
All
the
sous
in
his
pockets
had
\
\
had
long
gone
to
the
urchins
of
the
settlement;
he
had
been
as
tender
as
a
brother
\
\
long
gone
to
the
urchins
of
the
settlement;
he
had
been
as
tender
as
a
brother
\
\
with
the
colliers,
smiling
at
their
suspicion,
winning
them
over
by
his
quiet
\
\
with
the
colliers,
smiling
at
their
suspicion,
winning
them
over
by
his
quiet
\
\
workmanlike
ways
and
his
dislike
of
chattering.
But
decidedly
the
fusion
had
not
\
\
workmanlike
ways
and
his
dislike
of
chattering.
But
decidedly
the
fusion
had
not
\
\
taken
place.
\n
His
voice
changed,
his
eyes
grew
bright,
he
fixed
them
on
\xE9
tienne,
\
\
taken
place.
\n
His
voice
changed,
his
eyes
grew
bright,
he
fixed
them
on
é
tienne,
\
\
directly
addressing
him:
\n\"
Now,
do
you
understand
that?
These
hatworkers
at
Marseilles
\
\
directly
addressing
him:
\n\"
Now,
do
you
understand
that?
These
hatworkers
at
Marseilles
\
\
who
have
won
the
great
lottery
prize
of
a
hundred
thousand
francs
have
gone
off
\
\
who
have
won
the
great
lottery
prize
of
a
hundred
thousand
francs
have
gone
off
\
\
at
once
and
invested
it,
declaring
that
they
are
going
to
live
without
doing
anything!
\
\
at
once
and
invested
it,
declaring
that
they
are
going
to
live
without
doing
anything!
\
...
@@ -127,7 +127,7 @@ description: "The following are multiple choice questions (with answers) about h
\
out
as
much
as
you
like
against
the
rich,
you
haven't
got
courage
enough
to
give
\
\
out
as
much
as
you
like
against
the
rich,
you
haven't
got
courage
enough
to
give
\
\
back
to
the
poor
the
money
that
luck
brings
you.
You
will
never
be
worthy
of
happiness
\
\
back
to
the
poor
the
money
that
luck
brings
you.
You
will
never
be
worthy
of
happiness
\
\
as
long
as
you
own
anything,
and
your
hatred
of
the
bourgeois
proceeds
solely
\
\
as
long
as
you
own
anything,
and
your
hatred
of
the
bourgeois
proceeds
solely
\
\
from
an
angry
desire
to
be
bourgeois
yourselves
in
their
place.
\"\n
\xE9
mile
Zola,
\
\
from
an
angry
desire
to
be
bourgeois
yourselves
in
their
place.
\"\n
é
mile
Zola,
\
\
French
writer,
Germinal,
1885
\n
The
passage
displays
the
direct
concern
for
the
\
\
French
writer,
Germinal,
1885
\n
The
passage
displays
the
direct
concern
for
the
\
\
welfare
of
the
working
classes
that
was
typically
a
part
of
which
movement?
\n\
\
welfare
of
the
working
classes
that
was
typically
a
part
of
which
movement?
\n\
(A)
Capitalist
(B)
Scientific
(C)
Communist
(D)
Existentialist
\n
A:
Let's
think
step
\
(A)
Capitalist
(B)
Scientific
(C)
Communist
(D)
Existentialist
\n
A:
Let's
think
step
\
...
@@ -156,13 +156,14 @@ description: "The following are multiple choice questions (with answers) about h
\
whether
Jewish,
Christian
or
Turkish,
appear
to
me
no
other
than
human
inventions,
\
\
whether
Jewish,
Christian
or
Turkish,
appear
to
me
no
other
than
human
inventions,
\
\
set
up
to
terrify
and
enslave
mankind,
and
monopolize
power
and
profit.
\n
I
do
\
\
set
up
to
terrify
and
enslave
mankind,
and
monopolize
power
and
profit.
\n
I
do
\
\
not
mean
by
this
declaration
to
condemn
those
who
believe
otherwise;
they
have
\
\
not
mean
by
this
declaration
to
condemn
those
who
believe
otherwise;
they
have
\
\
the
same
right
to
their
belief
as
I
have
to
mine.
\n
\u2014
Thomas
Paine,
The
Age
\
\
the
same
right
to
their
belief
as
I
have
to
mine.
\n
—
Thomas
Paine,
The
Age
of
Reason,
\
\
of
Reason,
1794
\u2013
1795
\n
Which
of
the
following
Enlightenment
philosophes
designed
\
\
1794–
1795
\n
Which
of
the
following
Enlightenment
philosophes
designed
a
system
\
\
a
system
of
checks
and
balances
for
government
to
avoid
abuses
of
power?
\n
(A)
\
\
of
checks
and
balances
for
government
to
avoid
abuses
of
power?
\n
(A)
Jean
Jacques
\
\
Jean
Jacques
Rousseau
(B)
Baron
Montesquieu
(C)
Mary
Wollstonecraft
(D)
Adam
Smith
\n\
\
Rousseau
(B)
Baron
Montesquieu
(C)
Mary
Wollstonecraft
(D)
Adam
Smith
\n
A:
Let's
\
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
european
history
\
\
think
step
by
step.
We
refer
to
Wikipedia
articles
on
european
history
for
help.
\
\
for
help.
Baron
Montesquieu
was
a
18th
centrury
French
philsopher
who
wrote
extensively
\
\
Baron
Montesquieu
was
a
18th
centrury
French
philsopher
who
wrote
extensively
\
\
against
the
monoplization
of
power
and
advocated
for
a
system
of
checks
and
balances
\
\
against
the
monoplization
of
power
and
advocated
for
a
system
of
checks
and
balances
\
\
in
government
to
prevent
the
rise
of
despotism.
The
answer
is
(B)."
\
in
government
to
prevent
the
rise
of
despotism.
The
answer
is
(B)."
include: _mmlu_flan_cot_fewshot_template_yaml
"group": "mmlu_flan_cot_fewshot_humanities"
task: mmlu_flan_cot_fewshot_high_school_european_history
"include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_high_school_european_history"
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_geography.yaml
dataset_name: high_school_geography
"dataset_name": "high_school_geography"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
high
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
school
geography.
\
school
geography.
\n\n
Q:
Which
one
of
the
following
items
is
an
example
of
nonmaterial
\
\
culture?
\n
(A)
Dove
soap
(B)
Dove
candy
bar
(C)
Dove
symbol
(D)
A
dove
(bird).
\n\
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
geography
for
help.
\
Q:
Which
one
of
the
following
items
is
an
example
of
nonmaterial
culture?
\
Nonmaterial
culture
consists
of
cultural
ideas,
beliefs
or
symbols
that
are
not
\
\
physical
objects.
The
answer
is
(C).
\n\n
Q:
During
the
third
stage
of
the
demographic
\
(A)
Dove
soap
(B)
Dove
candy
bar
(C)
Dove
symbol
(D)
A
dove
(bird).
\
transition
model,
which
of
the
following
is
true?
\n
(A)
Birth
rates
increase
and
\
\
population
growth
rate
is
less
rapid.
(B)
Birth
rates
decline
and
population
growth
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
geography
for
help.
\
rate
is
less
rapid.
(C)
Birth
rates
increase
and
population
growth
rate
increases.
\
Nonmaterial
culture
consists
of
cultural
ideas,
beliefs
or
symbols
that
are
not
\
(D)
Birth
rates
decrease
and
population
growth
rate
increases.
\n
A:
Let's
think
\
physical
objects.
The
answer
is
(C).
\
step
by
step.
We
refer
to
Wikipedia
articles
on
geography
for
help.
The
demographic
\
\
transition
model
models
the
five
different
stages
of
population
growth
as
a
country
\
\
goes
through
economic
development,
where
the
third
stage
refers
to
a
period
of
\
Q:
During
the
third
stage
of
the
demographic
transition
model,
which
of
the
following
\
declining
birth
rates
and
lower
population
growth.
The
answer
is
(B).
\n\n
Q:
The
\
is
true?
\
practice
of
hiring
a
foreign
third-party
service
provider
to
run
an
operation
\
\
is
called
\n
(A)
outsourcing.
(B)
offshoring.
(C)
maquiladoras.
(D)
locational
interdependence.
\n\
(A)
Birth
rates
increase
and
population
growth
rate
is
less
rapid.
(B)
Birth
rates
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
geography
for
help.
\
decline
and
population
growth
rate
is
less
rapid.
(C)
Birth
rates
increase
and
population
\ \"
Offshoring
\"
literally
means
to
move
or
base
some
of
the
activities
or
processes
\
growth
rate
increases.
(D)
Birth
rates
decrease
and
population
growth
rate
increases.
\
of
a
company
to
a
foreign
country.
The
answer
is
(B).
\n\n
Q:
Which
of
the
following
\
\
statements
is
NOT
accurate
regarding
the
services
provided
by
local
governments
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
geography
for
help.
\
in
the
United
States?
\n
(A)
Duplication
of
efforts
occurs
often.
(B)
Social
problems
\
The
demographic
transition
model
models
the
five
different
stages
of
population
\
of
the
central
city
spill
over
into
the
surrounding
residential
suburbs.
(C)
Inefficiency
\
growth
as
a
country
goes
through
economic
development,
where
the
third
stage
refers
\
in
providing
services
occurs
often.
(D)
One
neighborhood's
efforts
to
reduce
pollution
\
to
a
period
of
declining
birth
rates
and
lower
population
growth.
The
answer
is
\
are
always
supported
by
neighboring
communities.
\n
A:
Let's
think
step
by
step.
\
(B).
\
We
refer
to
Wikipedia
articles
on
geography
for
help.
There
may
be
economic,
social
\
\
or
political
reasons
for
two
neighboring
communities
and
their
local
governments
\
\
not
agreeing
to
pollution
reduction
efforts
initiated
by
one
of
them.
The
answer
\
Q:
The
practice
of
hiring
a
foreign
third-party
service
provider
to
run
an
operation
\
is
(D).
\n\n
Q:
The
rate
of
natural
increase
of
a
population
is
found
by
subtracting
\
is
called
\
the
\n
(A)
crude
death
rate
from
the
crude
birth
date.
(B)
crude
birth
rate
from
\
\
the
crude
death
rate.
(C)
doubling
time
from
the
crude
birth
rate.
(D)
fertility
\
(A)
outsourcing.
(B)
offshoring.
(C)
maquiladoras.
(D)
locational
interdependence.
\
rate
from
the
crude
death
rate.
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
\
articles
on
geography
for
help.
The
difference
between
number
of
births
and
deaths
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
geography
for
help.
\
gives
the
population
increase
at
any
given
time.
The
answer
is
(A)."
"Offshoring"
literally
means
to
move
or
base
some
of
the
activities
or
processes
"group": "mmlu_flan_cot_fewshot_social_sciences"
of
a
company
to
a
foreign
country.
The
answer
is
(B).
"include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_high_school_geography"
Q:
Which
of
the
following
statements
is
NOT
accurate
regarding
the
services
provided
by
local
governments
in
the
United
States?
(A)
Duplication
of
efforts
occurs
often.
(B)
Social
problems
of
the
central
city
spill
over
into
the
surrounding
residential
suburbs.
(C)
Inefficiency
in
providing
services
occurs
often.
(D)
One
neighborhood'
'
s
efforts
to
reduce
pollution
are
always
supported
by
neighboring
communities.
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
geography
for
help.
There
may
be
economic,
social
or
political
reasons
for
two
neighboring
communities
and
their
local
governments
not
agreeing
to
pollution
reduction
efforts
initiated
by
one
of
them.
The
answer
is
(D).
Q:
The
rate
of
natural
increase
of
a
population
is
found
by
subtracting
the
(A)
crude
death
rate
from
the
crude
birth
date.
(B)
crude
birth
rate
from
the
crude
death
rate.
(C)
doubling
time
from
the
crude
birth
rate.
(D)
fertility
rate
from
the
crude
death
rate.
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
geography
for
help.
The
difference
between
number
of
births
and
deaths
gives
the
population
increase
at
any
given
time.
The
answer
is
(A).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_high_school_geography
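The last few-shot answer in this file's description reduces to a single subtraction: the rate of natural increase is the crude birth rate minus the crude death rate. A small sketch of that calculation follows. The function name is hypothetical and the rates are made-up illustrative values per 1,000 people, not figures from the dataset.

```python
def rate_of_natural_increase(crude_birth_rate: float, crude_death_rate: float) -> float:
    """Crude birth rate minus crude death rate, both per 1,000 people."""
    return crude_birth_rate - crude_death_rate


# Illustrative values only: 20 births and 8 deaths per 1,000 people.
rni_per_thousand = rate_of_natural_increase(20.0, 8.0)
print(rni_per_thousand)         # rate per 1,000 people
print(rni_per_thousand / 10.0)  # the same rate expressed as a percentage
```

Subtracting in the other order, as in choice (B), would flip the sign and report a growing population as shrinking.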
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_government_and_politics.yaml
dataset_name: high_school_government_and_politics
"dataset_name": "high_school_government_and_politics"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
high
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
school
government
and
politics.
\
school
government
and
politics.
\n\n
Q:
Which
of
the
following
best
states
an
argument
\
\
made
by
James
Madison
in
The
Federalist
number
10?
\n
(A)
Honest
politicians
can
\
\
prevent
factions
from
developing.
(B)
Factions
are
more
likely
to
occur
in
large
\
Q:
Which
of
the
following
best
states
an
argument
made
by
James
Madison
in
The
Federalist
\
republics
than
in
small
ones.
(C)
The
negative
effects
of
factionalism
can
be
\
number
10?
\
reduced
by
a
republican
government.
(D)
Free
elections
are
the
people's
best
defense
\
\
against
factionalism.
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
\
(A)
Honest
politicians
can
prevent
factions
from
developing.
(B)
Factions
are
more
\
on
government
and
politics
for
help.
In
the
Federalist
number
10,
James
Madison
\
likely
to
occur
in
large
republics
than
in
small
ones.
(C)
The
negative
effects
\
advocated
for
a
representative
republican
form
of
government
to
guard
against
\
of
factionalism
can
be
reduced
by
a
republican
government.
(D)
Free
elections
are
\
factionalism.
The
answer
is
(C).
\n\n
Q:
The
term
\"
budget
deficit
\"
refers
to
the
\n\
the
people'
'
s
best
defense
against
factionalism.
(A)
annual
increase
in
federal
spending
on
the
military
(B)
amount
of
interest
on
\
\
the
national
debt
(C)
difference
between
the
initial
budget
proposals
made
by
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
government
and
politics
\
the
president
and
Congress
(D)
amount
the
government
spends
in
excess
of
its
revenues
\n\
for
help.
In
the
Federalist
number
10,
James
Madison
advocated
for
a
representative
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
government
and
politics
\
republican
form
of
government
to
guard
against
factionalism.
The
answer
is
(C).
\
for
help.
When
the
goverment
spends
more
than
it
earns,
their
difference
is
the
\
\
budget
deficit.
The
answer
is
(D).
\n\n
Q:
Which
of
the
following
statements
about
\
\
cabinet
departments
is
FALSE?
\n
(A)
They
are
established
by
the
legislative
branch.
\
Q:
The
term
"budget
deficit"
refers
to
the
\
(B)
Their
members
often
don't
have
much
influence
over
presidential
decisions.
\
\
(C)
They
cannot
all
be
run
by
leaders
who
belong
to
the
same
political
party
the
\
(A)
annual
increase
in
federal
spending
on
the
military
(B)
amount
of
interest
on
\
president
does.
(D)
Not
every
federal
agency
is
a
cabinet
department.
\n
A:
Let's
\
the
national
debt
(C)
difference
between
the
initial
budget
proposals
made
by
the
\
think
step
by
step.
We
refer
to
Wikipedia
articles
on
government
and
politics
\
president
and
Congress
(D)
amount
the
government
spends
in
excess
of
its
revenues
\
for
help.
There
is
no
law
stipulating
that
some
cabinet
department
leaders
have
\
\
to
belong
to
a
political
party
different
from
that
of
the
president.
The
answer
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
government
and
politics
\
is
(C).
\n\n
Q:
Which
of
the
following
cases
established
the
precedent
that
a
defendant
\
for
help.
When
the
goverment
spends
more
than
it
earns,
their
difference
is
the
\
must
be
informed
of
the
right
to
remain
silent,
the
right
to
a
lawyer,
and
protection
\
budget
deficit.
The
answer
is
(D).
\
from
self-incrimination?
\n
(A)
Weeks
v.
United
States
(B)
Betts
v.
Brady
(C)
Mapp
\
\
v.
Ohio
(D)
Miranda
v.
Arizona
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
\
articles
on
government
and
politics
for
help.
In
the
landmark
Miranda
v.
Arizona
\
Q:
Which
of
the
following
statements
about
cabinet
departments
is
FALSE?
\
in
1966,
the
US
Supreme
Court,
based
on
the
Fifth
and
Sixth
Amendment
of
the
US
\
\
Constitution,
guaranteed
a
defendant's
right
to
an
attorney
and
protection
from
\
(A)
They
are
established
by
the
legislative
branch.
(B)
Their
members
often
don'
'
t
\
self-incrimination.
The
answer
is
(D).
\n\n
Q:
Uncertainty
over
the
limits
to
presidential
\
have
much
influence
over
presidential
decisions.
(C)
They
cannot
all
be
run
by
leaders
\
power
is
caused
primarily
by
the
fact
that
\n
(A)
the
constitutional
definition
\
who
belong
to
the
same
political
party
the
president
does.
(D)
Not
every
federal
\
of
those
powers
is
broad
and
unspecific
(B)
most
people
agree
that
the
Constitution
\
agency
is
a
cabinet
department.
\
places
too
many
limits
on
presidential
power
(C)
the
Supreme
Court
consistently
\
\
refuses
to
rule
on
cases
concerning
presidential
powers
(D)
constitutional
amendments
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
government
and
politics
\
have
greatly
increased
presidential
powers
\n
A:
Let's
think
step
by
step.
We
refer
\
for
help.
There
is
no
law
stipulating
that
some
cabinet
department
leaders
have
\
to
Wikipedia
articles
on
government
and
politics
for
help.
The
US
Constitution
\
to
belong
to
a
political
party
different
from
that
of
the
president.
The
answer
\
is
not
very
specific
about
the
powers
of
the
president,
leading
to
uncertainty
\
is
(C).
\
over
its
limits.
The
answer
is
(A)."
"group": "mmlu_flan_cot_fewshot_social_sciences"
"include": "_mmlu_flan_cot_fewshot_template_yaml"
Q:
Which
of
the
following
cases
established
the
precedent
that
a
defendant
must
"task": "mmlu_flan_cot_fewshot_high_school_government_and_politics"
be
informed
of
the
right
to
remain
silent,
the
right
to
a
lawyer,
and
protection
from
self-incrimination?
(A)
Weeks
v.
United
States
(B)
Betts
v.
Brady
(C)
Mapp
v.
Ohio
(D)
Miranda
v.
Arizona
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
government
and
politics
for
help.
In
the
landmark
Miranda
v.
Arizona
in
1966,
the
US
Supreme
Court,
based
on
the
Fifth
and
Sixth
Amendment
of
the
US
Constitution,
guaranteed
a
defendant'
'
s
right
to
an
attorney
and
protection
from
self-incrimination.
The
answer
is
(D).
Q:
Uncertainty
over
the
limits
to
presidential
power
is
caused
primarily
by
the
fact
that
(A)
the
constitutional
definition
of
those
powers
is
broad
and
unspecific
(B)
most
people
agree
that
the
Constitution
places
too
many
limits
on
presidential
power
(C)
the
Supreme
Court
consistently
refuses
to
rule
on
cases
concerning
presidential
powers
(D)
constitutional
amendments
have
greatly
increased
presidential
powers
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
government
and
politics
for
help.
The
US
Constitution
is
not
very
specific
about
the
powers
of
the
president,
leading
to
uncertainty
over
its
limits.
The
answer
is
(A).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_high_school_government_and_politics
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_macroeconomics.yaml
dataset_name: high_school_macroeconomics
"dataset_name": "high_school_macroeconomics"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
high
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
school
macroeconomics.
\
school
macroeconomics.
\n\n
Q:
Which
of
the
following
policies
best
describes
supply-side
\
\
fiscal
policy?
\n
(A)
An
increase
in
the
money
supply
(B)
Increased
government
spending
\
\
(C)
Lower
taxes
on
research
and
development
of
new
technology
(D)
Higher
taxes
\
Q:
Which
of
the
following
policies
best
describes
supply-side
fiscal
policy?
\
on
household
income
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
\
\
on
macroeconomics
for
help.
Supply-side
fiscal
policy
stimulates
the
economy
by
\
(A)
An
increase
in
the
money
supply
(B)
Increased
government
spending
(C)
Lower
\
encouraging
more
production
of
goods
and
services
through
reduction
in
taxes
and
\
taxes
on
research
and
development
of
new
technology
(D)
Higher
taxes
on
household
\
deregulation.
The
answer
is
(C).
\n\n
Q:
The
short-run
Phillips
curve
indicates
\
income
\
a
\n
(A)
direct
relation
between
unemployment
and
inflation
(B)
direct
relation
\
\
between
price
and
quantity
demanded
(C)
inverse
relation
between
price
and
quantity
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
macroeconomics
for
\
demanded
(D)
inverse
relation
between
unemployment
and
inflation
\n
A:
Let's
think
\
help.
Supply-side
fiscal
policy
stimulates
the
economy
by
encouraging
more
production
\
step
by
step.
We
refer
to
Wikipedia
articles
on
macroeconomics
for
help.
The
short-run
\
of
goods
and
services
through
reduction
in
taxes
and
deregulation.
The
answer
is
\
Phillips
curve
shows
that
whenever
unemployment
decreases
below
a
natural
level,
\
(C).
\
the
inflation
starts
increasing,
and
vice-versa.
The
answer
is
(D).
\n\n
Q:
Holding
\
\
all
else
equal
which
of
the
following
monetary
policies
would
be
used
to
boost
\
\
U.S.
exports?
\n
(A)
Increasing
the
discount
rate
(B)
Increasing
the
reserve
ratio
\
Q:
The
short-run
Phillips
curve
indicates
a
\
(C)
Buying
government
securities
(D)
Lowering
tariffs
\n
A:
Let's
think
step
by
\
\
step.
We
refer
to
Wikipedia
articles
on
macroeconomics
for
help.
Buying
government
\
(A)
direct
relation
between
unemployment
and
inflation
(B)
direct
relation
between
\
securities
leads
to
reduction
in
demand
for
US
dollars
from
foreign
buyers,
thereby
\
price
and
quantity
demanded
(C)
inverse
relation
between
price
and
quantity
demanded
\
making
it
cheaper
and
hence
making
US
exports
more
attractive.
The
answer
is
(C).
\n\
(D)
inverse
relation
between
unemployment
and
inflation
\n
Q:
A
federal
deficit
occurs
when
\n
(A)
exports
exceed
imports.
(B)
imports
exceed
\
\
exports.
(C)
federal
tax
collections
exceed
spending.
(D)
federal
spending
exceeds
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
macroeconomics
for
\
federal
tax
revenues.
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
\
help.
The
short-run
Phillips
curve
shows
that
whenever
unemployment
decreases
below
\
on
macroeconomics
for
help.
A
federal
deficit
occurs
when
federal
spending
exceeds
\
a
natural
level,
the
inflation
starts
increasing,
and
vice-versa.
The
answer
is
\
federal
income
which
is
primarily
from
tax
revenues.
The
answer
is
(D).
\n\n
Q:
\
(D).
\
Which
of
the
following
is
not
included
in
the
U.S.
GDP?
\n
(A)
The
U.S.
military
\
\
opens
a
new
base
in
a
foreign
country
with
1000
U.S.
personnel.
(B)
Japanese
consumers
\
\
buy
thousands
of
CDs
produced
in
the
United
States.
(C)
An
American
pop
singer
\
Q:
Holding
all
else
equal
which
of
the
following
monetary
policies
would
be
used
\
performs
a
sold-out
concert
in
Paris.
(D)
A
French
theatrical
production
tours
\
to
boost
U.S.
exports?
\
dozens
of
American
cities.
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
\
articles
on
macroeconomics
for
help.
The
economic
transactions
related
to
the
\
(A)
Increasing
the
discount
rate
(B)
Increasing
the
reserve
ratio
(C)
Buying
government
\
performance
of
the
American
pop-singer
in
Paris
happens
entirely
outside
the
U.S.
\
securities
(D)
Lowering
tariffs
\
and
hence
is
not
included
in
the
GDP
numbers.
The
answer
is
(C)."
"group": "mmlu_flan_cot_fewshot_social_sciences"
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
macroeconomics
for
"include": "_mmlu_flan_cot_fewshot_template_yaml"
help.
Buying
government
securities
leads
to
reduction
in
demand
for
US
dollars
from
"task": "mmlu_flan_cot_fewshot_high_school_macroeconomics"
foreign
buyers,
thereby
making
it
cheaper
and
hence
making
US
exports
more
attractive.
The
answer
is
(C).
Q:
A
federal
deficit
occurs
when
(A)
exports
exceed
imports.
(B)
imports
exceed
exports.
(C)
federal
tax
collections
exceed
spending.
(D)
federal
spending
exceeds
federal
tax
revenues.
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
macroeconomics
for
help.
A
federal
deficit
occurs
when
federal
spending
exceeds
federal
income
which
is
primarily
from
tax
revenues.
The
answer
is
(D).
Q:
Which
of
the
following
is
not
included
in
the
U.S.
GDP?
(A)
The
U.S.
military
opens
a
new
base
in
a
foreign
country
with
1000
U.S.
personnel.
(B)
Japanese
consumers
buy
thousands
of
CDs
produced
in
the
United
States.
(C)
An
American
pop
singer
performs
a
sold-out
concert
in
Paris.
(D)
A
French
theatrical
production
tours
dozens
of
American
cities.
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
macroeconomics
for
help.
The
economic
transactions
related
to
the
performance
of
the
American
pop-singer
in
Paris
happens
entirely
outside
the
U.S.
and
hence
is
not
included
in
the
GDP
numbers.
The
answer
is
(C).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_high_school_macroeconomics
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_mathematics.yaml
dataset_name: high_school_mathematics
"dataset_name": "high_school_mathematics"
description
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
"
description
"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
\
school
mathematics.
\n\n
Q:
Simplify
and
write
the
result
with
a
rational
denominator:
\
\
school
mathematics.
\n\n
Q:
Simplify
and
write
the
result
with
a
rational
denominator:
\
\
$$
\\
sqrt{
\\
sqrt[3]{
\\
sqrt{
\\
frac{1}{729}}}}$$
\n
(A)
\\
frac{3
\\
sqrt{3}}{3}
(B)
\\\
\
$$
\\
sqrt{
\\
sqrt[3]{
\\
sqrt{
\\
frac{1}{729}}}}$$
\n
(A)
\\
frac{3
\\
sqrt{3}}{3}
(B)
\\\
frac{1}{3}
(C)
\\
sqrt{3}
(D)
\\
frac{
\\
sqrt{3}}{3}
\n
A:
Let's
think
step
by
step.
\
frac{1}{3}
(C)
\\
sqrt{3}
(D)
\\
frac{
\\
sqrt{3}}{3}
\n
A:
Let's
think
step
by
step.
\
...
@@ -13,7 +13,7 @@ description: "The following are multiple choice questions (with answers) about h
\
of
$9600/300=32=2^5$.
Since
at
this
interest
rate
it
takes
six
years
for
it
to
\
\
of
$9600/300=32=2^5$.
Since
at
this
interest
rate
it
takes
six
years
for
it
to
\
\
double,
it
will
take
$5*6=30$
years
to
grow
to
$
\\
$9600$.
The
answer
is
(C).
\n\
\
double,
it
will
take
$5*6=30$
years
to
grow
to
$
\\
$9600$.
The
answer
is
(C).
\n\
\n
Q:
Ten
students
take
a
biology
test
and
receive
the
following
scores:
45,
55,
\
\n
Q:
Ten
students
take
a
biology
test
and
receive
the
following
scores:
45,
55,
\
\
50,
70,
65,
80,
40,
90,
70,
85.
What
is
the
mean
of
the
students
\u2019
test
scores?
\n\
\
50,
70,
65,
80,
40,
90,
70,
85.
What
is
the
mean
of
the
students
’
test
scores?
\n\
(A)
55
(B)
60
(C)
62
(D)
65
\n
A:
Let's
think
step
by
step.
There
are
10
students
\
(A)
55
(B)
60
(C)
62
(D)
65
\n
A:
Let's
think
step
by
step.
There
are
10
students
\
\
and
the
sum
of
their
scores
is
$45
+
55
+
50
+
70
+
65
+
80
+
40
+
90
+
70
+
85
\
\
and
the
sum
of
their
scores
is
$45
+
55
+
50
+
70
+
65
+
80
+
40
+
90
+
70
+
85
\
\
=
650$,
the
mean
is
$650/10=65$.
The
answer
is
(D).
\n\n
Q:
The
variable
$x$
varies
\
\
=
650$,
the
mean
is
$650/10=65$.
The
answer
is
(D).
\n\n
Q:
The
variable
$x$
varies
\
...
@@ -32,5 +32,6 @@ description: "The following are multiple choice questions (with answers) about h
\
dance.)
\n
(A)
3
(B)
15
(C)
6
(D)
5
\n
A:
Let's
think
step
by
step.
The
least
common
\
\
dance.)
\n
(A)
3
(B)
15
(C)
6
(D)
5
\n
A:
Let's
think
step
by
step.
The
least
common
\
\
multiple
of
2,
3
and
5
is
30,
so
during
a
7
minute
dance,
all
the
three
lights
\
\
multiple
of
2,
3
and
5
is
30,
so
during
a
7
minute
dance,
all
the
three
lights
\
\
will
come
on
at
the
same
time
$2*7+1=15$
times.
The
answer
is
(B)."
\
will
come
on
at
the
same
time
$2*7+1=15$
times.
The
answer
is
(B)."
include: _mmlu_flan_cot_fewshot_template_yaml
"group": "mmlu_flan_cot_fewshot_stem"
task: mmlu_flan_cot_fewshot_high_school_mathematics
"include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_high_school_mathematics"
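The arithmetic quoted in the few-shot answers of this file can be re-checked with a short script. This is only a sanity-check sketch, not part of the commit: the variable names are illustrative, and the assumption that the three lights blink every 2, 3 and 5 seconds is inferred from the quoted least-common-multiple step rather than shown in the visible hunks.

```python
import math

# Scores quoted in the mean-of-test-scores few-shot example.
scores = [45, 55, 50, 70, 65, 80, 40, 90, 70, 85]
total = sum(scores)
mean = total / len(scores)

# Doubling example: the quoted ratio 9600/300 = 32 = 2^5, with one
# doubling every six years.
doublings = math.log2(9600 / 300)
years = doublings * 6

# Lights example (assumed periods of 2, 3 and 5 seconds): count how often
# all three coincide during a 7-minute dance, including the very start.
period = math.lcm(2, 3, 5)  # requires Python 3.9+
coincidences = (7 * 60) // period + 1

print(total, mean, years, coincidences)
```

The three results agree with the answers (D), (C) and (B) given in the description string.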
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_microeconomics.yaml
dataset_name: high_school_microeconomics
"dataset_name": "high_school_microeconomics"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
high
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
school
microeconomics.
\
school
microeconomics.
\n\n
Q:
Which
of
the
following
is
necessarily
a
characteristic
\
\
of
oligopoly?
\n
(A)
Free
entry
into
and
exit
from
the
market
(B)
A
few
large
producers
\
\
(C)
One
producer
of
a
good
with
no
close
substitutes
(D)
A
homogenous
product
\n\
Q:
Which
of
the
following
is
necessarily
a
characteristic
of
oligopoly?
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
microeconomics
for
\
\
help.
An
oligopoly
is
when
a
market
is
dominated
by
just
one
or
a
few
number
of
\
(A)
Free
entry
into
and
exit
from
the
market
(B)
A
few
large
producers
(C)
One
producer
\
sellers
or
producers.
To
get
oligopoly,
the
market
should
have
high
barriers
to
\
of
a
good
with
no
close
substitutes
(D)
A
homogenous
product
\
new
entry,
and
the
product
has
differentiation.
The
answer
is
(B).
\n\n
Q:
If
the
\
\
government
subsidizes
producers
in
a
perfectly
competitive
market,
then
\n
(A)
the
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
microeconomics
for
\
demand
for
the
product
will
increase
(B)
the
demand
for
the
product
will
decrease
\
help.
An
oligopoly
is
when
a
market
is
dominated
by
just
one
or
a
few
number
of
\
(C)
the
consumer
surplus
will
increase
(D)
the
consumer
surplus
will
decrease
\n\
sellers
or
producers.
To
get
oligopoly,
the
market
should
have
high
barriers
to
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
microeconomics
for
\
new
entry,
and
the
product
has
differentiation.
The
answer
is
(B).
\
help.
(A)
and
(B)
are
wrong
because
the
demand
curve
does
not
change
at
all.
If
\
\
the
government
subsidizes
producers,
the
supply
will
increase,
and
thus
the
consumer
\
\
surplus
also
increases.
The
answer
is
(C).
\n\n
Q:
Which
of
the
following
is
true
\
Q:
If
the
government
subsidizes
producers
in
a
perfectly
competitive
market,
then
\
of
a
price
floor?
\n
(A)
The
price
floor
shifts
the
demand
curve
to
the
left.
(B)
\
\
An
effective
floor
creates
a
shortage
of
the
good.
(C)
The
price
floor
shifts
\
(A)
the
demand
for
the
product
will
increase
(B)
the
demand
for
the
product
will
\
the
supply
curve
of
the
good
to
the
right.
(D)
To
be
an
effective
floor,
it
must
\
decrease
(C)
the
consumer
surplus
will
increase
(D)
the
consumer
surplus
will
decrease
\
be
set
above
the
equilibrium
price.
\n
A:
Let's
think
step
by
step.
We
refer
to
\
\
Wikipedia
articles
on
microeconomics
for
help.
Price
floor
does
not
shift
the
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
microeconomics
for
\
demand
or
shift
curve.
An
effective
price
floor
should
be
set
above
the
equilibrium
\
help.
(A)
and
(B)
are
wrong
because
the
demand
curve
does
not
change
at
all.
If
\
price,
otherwise
the
market
bears
and
the
floor
does
not
have
effective
effect.
\
the
government
subsidizes
producers,
the
supply
will
increase,
and
thus
the
consumer
\
The
answer
is
(D).
\n\n
Q:
The
concentration
ratio
for
a
monopoly
is
\n
(A)
0
(B)
\
surplus
also
increases.
The
answer
is
(C).
\
5
(C)
10
(D)
100
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
\
\
on
microeconomics
for
help.
The
concentration
ratio
is
calculated
as
the
sum
of
\
\
market
share
of
a
specific
number
of
largest
companies.
Monopoly
means
one
company
\
Q:
Which
of
the
following
is
true
of
a
price
floor?
\
or
entity
controls
the
entire
market,
therefore,
the
concentration
ratio
is
100
\
\
percent.
The
answer
is
(D).
\n\n
Q:
In
a
competitive
labor
market
for
housepainters,
\
(A)
The
price
floor
shifts
the
demand
curve
to
the
left.
(B)
An
effective
floor
\
which
of
the
following
would
increase
the
demand
for
housepainters?
\n
(A)
An
effective
\
creates
a
shortage
of
the
good.
(C)
The
price
floor
shifts
the
supply
curve
of
the
\
minimum
wage
imposed
on
this
labor
market.
(B)
An
increase
in
the
price
of
gallons
\
good
to
the
right.
(D)
To
be
an
effective
floor,
it
must
be
set
above
the
equilibrium
\
of
paint.
(C)
An
increase
in
the
construction
of
new
houses.
(D)
An
increase
in
\
price.
\
the
price
of
mechanical
painters
so
long
as
the
output
effect
exceeds
the
substitution
\
\
effect.
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
microeconomics
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
microeconomics
for
\
for
help.
An
increase
in
the
construction
of
new
houses
means
an
increase
demand
\
help.
Price
floor
does
not
shift
the
demand
or
shift
curve.
An
effective
price
floor
\
of
in-house
painting,
thus
increases
the
demand
for
housepainters.
The
answer
\
should
be
set
above
the
equilibrium
price,
otherwise
the
market
bears
and
the
floor
\
is
(C)."
does
not
have
effective
effect.
The
answer
is
(D).
"group": "mmlu_flan_cot_fewshot_social_sciences"
"include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_high_school_microeconomics"
Q:
The
concentration
ratio
for
a
monopoly
is
(A)
0
(B)
5
(C)
10
(D)
100
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
microeconomics
for
help.
The
concentration
ratio
is
calculated
as
the
sum
of
market
share
of
a
specific
number
of
largest
companies.
Monopoly
means
one
company
or
entity
controls
the
entire
market,
therefore,
the
concentration
ratio
is
100
percent.
The
answer
is
(D).
Q:
In
a
competitive
labor
market
for
housepainters,
which
of
the
following
would
increase
the
demand
for
housepainters?
(A)
An
effective
minimum
wage
imposed
on
this
labor
market.
(B)
An
increase
in
the
price
of
gallons
of
paint.
(C)
An
increase
in
the
construction
of
new
houses.
(D)
An
increase
in
the
price
of
mechanical
painters
so
long
as
the
output
effect
exceeds
the
substitution
effect.
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
microeconomics
for
help.
An
increase
in
the
construction
of
new
houses
means
an
increase
demand
of
in-house
painting,
thus
increases
the
demand
for
housepainters.
The
answer
is
(C).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_high_school_microeconomics
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_physics.yaml
dataset_name: high_school_physics
"dataset_name": "high_school_physics"
description
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
"
description
"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
\
school
physics.
\n\n
Q:
A
microwave
oven
is
connected
to
an
outlet,
120
V,
and
draws
\
\
school
physics.
\n\n
Q:
A
microwave
oven
is
connected
to
an
outlet,
120
V,
and
draws
\
\
a
current
of
2
amps.
At
what
rate
is
energy
being
used
by
the
microwave
oven?
\n\
\
a
current
of
2
amps.
At
what
rate
is
energy
being
used
by
the
microwave
oven?
\n\
(A)
10
W
(B)
30
W
(C)
60
W
(D)
240
W
\n
A:
Let's
think
step
by
step.
Rate
of
energy
\
(A)
10
W
(B)
30
W
(C)
60
W
(D)
240
W
\n
A:
Let's
think
step
by
step.
Rate
of
energy
\
\
usage
is
known
as
power;
in
an
dissipative
electrical
circuit,
power
is
given
\
\
usage
is
known
as
power;
in
an
dissipative
electrical
circuit,
power
is
given
\
\
by
voltage
times
current.
So
in
our
case,
the
power
is
120
V
times
2
amps,
or
\
\
by
voltage
times
current.
So
in
our
case,
the
power
is
120
V
times
2
amps,
or
\
\
240
W.
The
answer
is
(D).
\n\n
Q:
A
point
charge,
Q
=
+1
mC,
is
fixed
at
the
origin.
\
\
240
W.
The
answer
is
(D).
\n\n
Q:
A
point
charge,
Q
=
+1
mC,
is
fixed
at
the
origin.
\
\
How
much
work
is
required
to
move
a
charge,
Q
=
+8
\xB5
C,
from
the
point
(0,
4
\
\
How
much
work
is
required
to
move
a
charge,
Q
=
+8
µC,
from
the
point
(0,
4
meters)
\
\
meters)
to
the
point
(3
meters,
0)?
\n
(A)
3.5
J
(B)
6.0
J
(C)
22.5
J
(D)
40
J
\n\
\
to
the
point
(3
meters,
0)?
\n
(A)
3.5
J
(B)
6.0
J
(C)
22.5
J
(D)
40
J
\n
A:
Let's
\
A:
Let's
think
step
by
step.
To
calculate
the
work
required
to
move
a
charge
from
\
\
think
step
by
step.
To
calculate
the
work
required
to
move
a
charge
from
one
location
\
\
one
location
to
another
in
a
fixed
electric
field,
it
is
enough
to
calculate
the
\
\
to
another
in
a
fixed
electric
field,
it
is
enough
to
calculate
the
potential
\
\
potential
difference
between
the
two
locations.
Here,
the
potential
only
depends
\
\
difference
between
the
two
locations.
Here,
the
potential
only
depends
on
the
\
\
on
the
distance
between
the
charges;
it
\u2019
s
$k
q_1
q_2
/
r$,
where
$k$
is
Coulomb
\u2019\
\
distance
between
the
charges;
it’s
$k
q_1
q_2
/
r$,
where
$k$
is
Coulomb’s
constant.
\
s
constant.
Plugging
in
values
$q_1
=
$
1
mC,
$q_2
=
8
\\
mu$
C,
gives
the
answer
\
\
Plugging
in
values
$q_1
=
$
1
mC,
$q_2
=
8
\\
mu$
C,
gives
the
answer
as
5.992
\
\
as
5.992
J,
which
rounds
to
6
J.
The
answer
is
(B).
\n\n
Q:
Which
of
the
following
\
\
J,
which
rounds
to
6
J.
The
answer
is
(B).
\n\n
Q:
Which
of
the
following
conditions
\
\
conditions
will
ensure
that
angular
momentum
is
conserved?
I.
Conservation
of
\
\
will
ensure
that
angular
momentum
is
conserved?
I.
Conservation
of
linear
momentum
\
\
linear
momentum
II.
Zero
net
external
force
III.
Zero
net
external
torque
\n
(A)
\
\
II.
Zero
net
external
force
III.
Zero
net
external
torque
\n
(A)
I
and
II
only
(B)
\
\
I
and
II
only
(B)
I
and
III
only
(C)
II
and
III
only
(D)
III
only
\n
A:
Let's
think
\
\
I
and
III
only
(C)
II
and
III
only
(D)
III
only
\n
A:
Let's
think
step
by
step.
\
\
step
by
step.
Torque
is
defined
as
the
change
in
angular
momentum;
if
there
is
\
\
Torque
is
defined
as
the
change
in
angular
momentum;
if
there
is
zero
external
\
\
zero
external
torque,
angular
momentum
is
conserved.
The
answer
is
(D).
\n\n
Q:
\
\
torque,
angular
momentum
is
conserved.
The
answer
is
(D).
\n\n
Q:
A
photocell
of
\
\
A
photocell
of
work
function
\u03D5
=
2eV
is
connected
to
a
resistor
in
series.
\
\
work
function
ϕ
=
2eV
is
connected
to
a
resistor
in
series.
Light
of
frequency
\
\
Light
of
frequency
f
=
1
\xD7
10^15
Hz
hits
a
metal
plate
of
the
photocell.
If
\
\
f
=
1
×
10^15
Hz
hits
a
metal
plate
of
the
photocell.
If
the
power
of
the
light
\
\
the
power
of
the
light
is
P
=
100
W,
what
is
the
current
through
the
resistor?
\n\
\
is
P
=
100
W,
what
is
the
current
through
the
resistor?
\n
(A)
2:00
AM
(B)
6:00
\
(A)
2:00
AM
(B)
6:00
AM
(C)
12:00
AM
(D)
24
A
\n
A:
Let's
think
step
by
step.
The
\
\
AM
(C)
12:00
AM
(D)
24
A
\n
A:
Let's
think
step
by
step.
The
only
answer
above
which
\
\
only
answer
above
which
has
units
of
current
is
D,
24
A.
The
answer
is
(D).
\n\n\
\
has
units
of
current
is
D,
24
A.
The
answer
is
(D).
\n\n
Q:
A
pipe
full
of
air
is
\
Q:
A
pipe
full
of
air
is
closed
at
one
end.
A
standing
wave
is
produced
in
the
pipe,
\
\
closed
at
one
end.
A
standing
wave
is
produced
in
the
pipe,
causing
the
pipe
to
\
\
causing
the
pipe
to
sound
a
note.
Which
of
the
following
is
a
correct
statement
\
\
sound
a
note.
Which
of
the
following
is
a
correct
statement
about
the
wave’s
properties
\
\
about
the
wave
\u2019
s
properties
at
the
closed
end
of
the
pipe?
\n
(A)
The
pressure
\
\
at
the
closed
end
of
the
pipe?
\n
(A)
The
pressure
is
at
a
node,
but
the
particle
\
\
is
at
a
node,
but
the
particle
displacement
is
at
an
antinode.
(B)
The
pressure
\
\
displacement
is
at
an
antinode.
(B)
The
pressure
is
at
an
antinode,
but
the
particle
\
\
is
at
an
antinode,
but
the
particle
displacement
is
at
a
node.
(C)
The
pressure
\
\
displacement
is
at
a
node.
(C)
The
pressure
and
the
particle
displacement
are
\
\
and
the
particle
displacement
are
both
at
nodes.
(D)
The
pressure
and
the
particle
\
\
both
at
nodes.
(D)
The
pressure
and
the
particle
displacement
are
both
at
antinodes.
\n\
\
displacement
are
both
at
antinodes.
\n
A:
Let's
think
step
by
step.
At
the
closed
\
A:
Let's
think
step
by
step.
At
the
closed
end
of
the
pipe,
the
particles
cannot
\
\
end
of
the
pipe,
the
particles
cannot
have
any
net
displacement
because
the
pipe
\
\
have
any
net
displacement
because
the
pipe
closure
stops
them.
So
the
particle
\
\
closure
stops
them.
So
the
particle
displacement
is
at
a
node.
This
closure
also
\
\
displacement
is
at
a
node.
This
closure
also
causes
the
pressure
to
be
maximal,
\
\
causes
the
pressure
to
be
maximal,
i.e.
an
antinode.
The
answer
is
(B)."
\
i.e.
an
antinode.
The
answer
is
(B)."
include: _mmlu_flan_cot_fewshot_template_yaml
"group": "mmlu_flan_cot_fewshot_stem"
task: mmlu_flan_cot_fewshot_high_school_physics
"include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_high_school_physics"
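The two numeric physics answers in this file (240 W and roughly 6 J) can be reproduced in a few lines. This is a hedged sketch for checking the quoted reasoning, not code from the commit. The rounded value of Coulomb's constant and the variable names are assumptions, and the start and end distances are read off the quoted points (0, 4 meters) and (3 meters, 0).

```python
# Power drawn by the microwave oven: P = V * I.
voltage = 120.0  # volts
current = 2.0    # amperes
power = voltage * current

# Work to move q2 in the field of the fixed charge q1, from distance
# r_start to distance r_end: W = k * q1 * q2 * (1/r_end - 1/r_start).
k = 8.988e9      # Coulomb's constant in N*m^2/C^2 (rounded, assumed value)
q1 = 1e-3        # 1 mC, fixed at the origin
q2 = 8e-6        # 8 microcoulombs, the charge being moved
r_start = 4.0    # meters, distance of (0, 4 m) from the origin
r_end = 3.0      # meters, distance of (3 m, 0) from the origin
work = k * q1 * q2 * (1 / r_end - 1 / r_start)

print(power, round(work, 3))
```

With these inputs the power matches answer (D) and the work rounds to the 6.0 J of answer (B).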
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_psychology.yaml
dataset_name: high_school_psychology
"dataset_name": "high_school_psychology"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
high
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
school
psychology.
\
school
psychology.
\n\n
Q:
Pascale
is
interested
in
the
processing
strategies
children
\
\
use
to
learn
new
information.
Pascale
would
best
be
classified
as
what
type
of
\
\
psychologist?
\n
(A)
sociocultural
(B)
clinical
(C)
cognitive
(D)
behaviorist
\n\
Q:
Pascale
is
interested
in
the
processing
strategies
children
use
to
learn
new
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
psychology
for
help.
\
information.
Pascale
would
best
be
classified
as
what
type
of
psychologist?
\
Sociocultural
psychologist
focuses
on
the
effect
of
societal
factors
on
people.
\
\
Clinical
psychologist
focuses
on
people
with
mental
issues.
Cognitive
psychologist
\
(A)
sociocultural
(B)
clinical
(C)
cognitive
(D)
behaviorist
\
focuses
on
how
people
think
and
learn,
including
the
processing
strategies.
Behaviorist
\
\
focuses
more
on
the
environment
and
experience
effect
on
people.
The
answer
is
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
psychology
for
help.
\
(C).
\n\n
Q:
According
to
Caplan's
model
of
consultee-centered
case
consultation,
\
Sociocultural
psychologist
focuses
on
the
effect
of
societal
factors
on
people.
\
the
consultant
is
primarily
interested
in
\n
(A)
identifying
the
causes
and
solutions
\
Clinical
psychologist
focuses
on
people
with
mental
issues.
Cognitive
psychologist
\
of
the
client's
presenting
problems
(B)
identifying
and
eliminating
the
causes
\
focuses
on
how
people
think
and
learn,
including
the
processing
strategies.
Behaviorist
\
of
the
consultee's
difficulties
in
handling
a
problem
(C)
establishing
a
hierarchy
\
focuses
more
on
the
environment
and
experience
effect
on
people.
The
answer
is
(C).
\
of
authority
to
enable
effective
decision
making
(D)
presenting
a
single,
well-defined
\
\
and
unambiguous
course
of
action
for
the
consultant
to
overcome
skills
deficits
\n\
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
psychology
for
help.
\
Q:
According
to
Caplan'
'
s
model
of
consultee-centered
case
consultation,
the
consultant
\
Caplan
defines
two
type
of
consultation.
Client-centered
case
consultation
aims
\
is
primarily
interested
in
\
to
handle
client's
problems,
while
consultee-centered
case
consultation
aims
to
\
\
identify
the
reason
of
client's
difficulty
to
solve
problems.
The
answer
is
(B).
\n\
(A)
identifying
the
causes
and
solutions
of
the
client'
'
s
presenting
problems
(B)
\n
Q:
According
to
the
Individuals
with
Disabilities
Education
Improvement
Act,
which
\
identifying
and
eliminating
the
causes
of
the
consultee'
'
s
difficulties
in
handling
\
of
the
following
must
an
educational
agency
do
before
it
changes
the
educational
\
a
problem
(C)
establishing
a
hierarchy
of
authority
to
enable
effective
decision
\
placement
of
a
student
with
a
disability?
\n
(A)
Give
the
child
a
trial
period
in
\
making
(D)
presenting
a
single,
well-defined
and
unambiguous
course
of
action
for
\
the
new
environment
(B)
Notify
the
parents
in
writing
(C)
Obtain
school
board
\
the
consultant
to
overcome
skills
deficits
\
approval
(D)
Obtain
parental
consent
\n
A:
Let's
think
step
by
step.
We
refer
to
\
\
Wikipedia
articles
on
psychology
for
help.
When
the
decision
to
change
the
educational
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
psychology
for
help.
\
placement
of
a
student
with
a
disability
is
made,
the
educational
agency
must
\
Caplan
defines
two
type
of
consultation.
Client-centered
case
consultation
aims
\
notify
the
parents
in
writing
on
that
date.
The
answer
is
(B).
\n\n
Q:
While
swimming
\
to
handle
client'
'
s
problems,
while
consultee-centered
case
consultation
aims
to
\
in
the
ocean,
Ivan
is
frightened
by
a
dark
shadow
in
the
water
even
before
he
\
identify
the
reason
of
client'
'
s
difficulty
to
solve
problems.
The
answer
is
(B).
\
has
the
chance
to
identify
what
the
shadow
is.
The
synaptic
connections
taking
\
\
place
during
this
incident
of
fright
are
best
described
by
which
of
the
following?
\n\
(A)
Messages
are
sent
from
the
thalamus
directly
to
the
amygdala.
(B)
Messages
are
\
Q:
According
to
the
Individuals
with
Disabilities
Education
Improvement
Act,
which
\
sent
from
the
thalamus
to
the
\"
what
\"
and
\"
where
\"
pathways.
(C)
Messages
are
\
of
the
following
must
an
educational
agency
do
before
it
changes
the
educational
\
sent
from
the
parasympathetic
nervous
system
to
the
cerebral
cortex.
(D)
Messages
\
placement
of
a
student
with
a
disability?
\
are
sent
from
the
frontal
lobes
to
the
pituitary
gland.
\n
A:
Let's
think
step
by
\
\
step.
We
refer
to
Wikipedia
articles
on
psychology
for
help.
Our
neural
system
\
(A)
Give
the
child
a
trial
period
in
the
new
environment
(B)
Notify
the
parents
\
has
a
mechanism
that
can
respond
immediate
emotional
signal
before
going
to
the
\
in
writing
(C)
Obtain
school
board
approval
(D)
Obtain
parental
consent
\
thought
center.
In
the
Ivan's
case,
messages
travel
directly
from
thalamus
to
\
\
amygdala.
The
answer
is
(A).
\n\n
Q:
Ani
believes
that
her
attitudes
and
behavior
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
psychology
for
help.
\
play
a
central
role
in
what
happens
to
her.
Such
a
belief
is
likely
to
be
associated
\
When
the
decision
to
change
the
educational
placement
of
a
student
with
a
disability
\
with
\n
(A)
a
strong
superego.
(B)
low
self-esteem.
(C)
low
self-efficacy.
(D)
an
\
is
made,
the
educational
agency
must
notify
the
parents
in
writing
on
that
date.
\
internal
locus
of
control.
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
The
answer
is
(B).
\
articles
on
psychology
for
help.
People
with
an
external
locus
of
control
believes
\
\
fate
and
luck
play
an
important
role
in
their
lives,
while
people
with
an
internal
\
\
locus
of
control
believes
they
control
their
lives.
The
answer
is
(D)."
Q:
While
swimming
in
the
ocean,
Ivan
is
frightened
by
a
dark
shadow
in
the
water
"group": "mmlu_flan_cot_fewshot_social_sciences"
even
before
he
has
the
chance
to
identify
what
the
shadow
is.
The
synaptic
connections
"include": "_mmlu_flan_cot_fewshot_template_yaml"
taking
place
during
this
incident
of
fright
are
best
described
by
which
of
the
following?
"task": "mmlu_flan_cot_fewshot_high_school_psychology"
(A)
Messages
are
sent
from
the
thalamus
directly
to
the
amygdala.
(B)
Messages
are
sent
from
the
thalamus
to
the
"what"
and
"where"
pathways.
(C)
Messages
are
sent
from
the
parasympathetic
nervous
system
to
the
cerebral
cortex.
(D)
Messages
are
sent
from
the
frontal
lobes
to
the
pituitary
gland.
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
psychology
for
help.
Our
neural
system
has
a
mechanism
that
can
respond
immediate
emotional
signal
before
going
to
the
thought
center.
In
the
Ivan'
'
s
case,
messages
travel
directly
from
thalamus
to
amygdala.
The
answer
is
(A).
Q:
Ani
believes
that
her
attitudes
and
behavior
play
a
central
role
in
what
happens
to
her.
Such
a
belief
is
likely
to
be
associated
with
(A)
a
strong
superego.
(B)
low
self-esteem.
(C)
low
self-efficacy.
(D)
an
internal
locus
of
control.
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
psychology
for
help.
People
with
an
external
locus
of
control
believes
fate
and
luck
play
an
important
role
in
their
lives,
while
people
with
an
internal
locus
of
control
believes
they
control
their
lives.
The
answer
is
(D).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_high_school_psychology
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_statistics.yaml
dataset_name: high_school_statistics
"dataset_name": "high_school_statistics"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
high
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
school
statistics.
\
school
statistics.
\n\n
Q:
A
new
smartwatch
is
manufactured
in
one
part
of
a
factory,
\
\
then
secured
for
shipping
in
another,
independent
part
of
the
factory.
The
weight
\
\
of
the
smartwatch
has
a
mean
of
62
grams
and
a
standard
deviation
of
1.0
grams.
\
Q:
A
new
smartwatch
is
manufactured
in
one
part
of
a
factory,
then
secured
for
shipping
\
The
weight
of
the
packaging
(box,
user's
guide,
bubble
wrap,
etc.)
has
a
mean
\
in
another,
independent
part
of
the
factory.
The
weight
of
the
smartwatch
has
a
\
of
456
grams
and
a
standard
deviation
of
6
grams.
Together,
the
distribution
of
\
mean
of
62
grams
and
a
standard
deviation
of
1.0
grams.
The
weight
of
the
packaging
\
the
weight
of
the
smartwatch
and
its
packaging
would
have
the
following
mean
and
\
(box,
user'
'
s
guide,
bubble
wrap,
etc.)
has
a
mean
of
456
grams
and
a
standard
deviation
\
standard
deviation:
\n
(A)
Mean
518
grams;
standard
deviation
7.0
grams
(B)
Mean
\
of
6
grams.
Together,
the
distribution
of
the
weight
of
the
smartwatch
and
its
packaging
\
518
grams;
standard
deviation
3.5
grams
(C)
Mean
518
grams;
standard
deviation
\
would
have
the
following
mean
and
standard
deviation:
\
6.1
grams
(D)
Mean
394
grams;
standard
deviation
6.1
grams
\n
A:
Let's
think
step
\
\
by
step.
Since
the
weight
of
the
watch
and
the
weight
of
the
packaging
are
independent
\
(A)
Mean
518
grams;
standard
deviation
7.0
grams
(B)
Mean
518
grams;
standard
deviation
\
random
variables,
the
mean
and
variance
of
their
sum
is
equal
to
the
sum
of
their
\
3.5
grams
(C)
Mean
518
grams;
standard
deviation
6.1
grams
(D)
Mean
394
grams;
standard
\
individual
means
and
variances.
So
the
mean
is
62
+
456
=
518
grams,
and
the
variances
\
deviation
6.1
grams
\
is
1.0^2
+
6.0^2
=
37,
leading
to
a
standard
deviation
of
6.1
grams.
The
answer
\
\
is
(C).
\n\n
Q:
After
a
frost
warning
was
issued,
the
owner
of
a
large
orange
grove
\
A:
Let'
'
s
think
step
by
step.
Since
the
weight
of
the
watch
and
the
weight
of
the
\
asked
his
workers
to
spray
all
his
trees
with
water.
The
water
was
supposed
to
\
packaging
are
independent
random
variables,
the
mean
and
variance
of
their
sum
is
\
freeze
and
form
a
protective
covering
of
ice
around
the
orange
blossom.
Nevertheless,
\
equal
to
the
sum
of
their
individual
means
and
variances.
So
the
mean
is
62
+
456
\
the
owner
suspected
that
some
trees
suffered
considerable
damage
due
to
the
frost.
\
=
518
grams,
and
the
variances
is
1.0^2
+
6.0^2
=
37,
leading
to
a
standard
deviation
\
To
estimate
the
proportion
of
trees
that
suffered
more
than
50
percent
damage
\
of
6.1
grams.
The
answer
is
(C).
\
due
to
the
frost,
he
took
a
random
sample
of
100
trees
from
his
grove.
What
is
\
\
the
response
variable
in
this
experiment?
\n
(A)
The
proportion
of
trees
that
suffered
\
\
more
than
50
percent
damage
due
to
frost.
(B)
The
number
of
trees
affected
by
\
Q:
After
a
frost
warning
was
issued,
the
owner
of
a
large
orange
grove
asked
his
\
the
frost.
(C)
The
number
of
trees
sampled
from
the
grove.
(D)
For
each
sampled
\
workers
to
spray
all
his
trees
with
water.
The
water
was
supposed
to
freeze
and
\
tree,
whether
it
suffered
more
than
50
percent
damage
or
at
most
50
percent
damage.
\n\
form
a
protective
covering
of
ice
around
the
orange
blossom.
Nevertheless,
the
owner
A:
Let's
think
step
by
step.
In
this
experiment,
the
response
variable
is
what
is
\
suspected
that
some
trees
suffered
considerable
damage
due
to
the
frost.
To
estimate
\
measured.
For
each
tree,
what
is
measured
is
whether
or
not
it
suffered
more
than
\
the
proportion
of
trees
that
suffered
more
than
50
percent
damage
due
to
the
frost,
\
50
percent
damage
due
to
the
frost.
The
answer
is
(D).
\n\n
Q:
Suppose
X
and
Y
are
\
he
took
a
random
sample
of
100
trees
from
his
grove.
What
is
the
response
variable
\
random
variables
with
E(X)
=
37,
var(X)
=
5,
E(Y)
=
62,
and
var(Y)
=
12.
What
\
in
this
experiment?
\
are
the
expected
value
and
variance
of
the
random
variable
X
+
Y?
\n
(A)
E(X
+
Y)
\
\
=
99,
var(X
+
Y)
=
8.5
(B)
E(X
+
Y)
=
99,
var(X
+
Y)
=
13
(C)
E(X
+
Y)
=
99,
var(X
\
(A)
The
proportion
of
trees
that
suffered
more
than
50
percent
damage
due
to
frost.
\
+
Y)
=
17
(D)
There
is
insufficient
information
to
answer
this
question.
\n
A:
Let's
\
(B)
The
number
of
trees
affected
by
the
frost.
(C)
The
number
of
trees
sampled
from
\
think
step
by
step.
While
means
of
sums
of
random
variables
add
(regardless
of
\
the
grove.
(D)
For
each
sampled
tree,
whether
it
suffered
more
than
50
percent
damage
\
whether
the
variables
are
independent)
in
order
to
determine
the
variance
of
a
\
or
at
most
50
percent
damage.
\
sum
of
random
variables,
we
need
to
know
not
just
their
individual
variances
but
\
\
the
covariance
of
the
two
variables,
which
is
not
given
in
this
problem.
The
answer
\
A:
Let'
'
s
think
step
by
step.
In
this
experiment,
the
response
variable
is
what
\
is
(D).
\n\n
Q:
Which
of
the
following
sets
has
the
smallest
standard
deviation?
\
is
measured.
For
each
tree,
what
is
measured
is
whether
or
not
it
suffered
more
\
Which
has
the
largest?
\n
I:
{1,2,3}
\n
II:
{-10,10}
\n
III:
{100}
\n
(A)
I,
II
(B)
II,
\
than
50
percent
damage
due
to
the
frost.
The
answer
is
(D).
\
III
(C)
III,
I
(D)
III,
II
\n
A:
Let's
think
step
by
step.
The
variance
of
distribution
\
\
I
is
the
expected
squared
deviation
from
its
mean
(which
is
2),
so
the
variance
\
\
is
2/3
.
The
variance
of
distribution
II
is
10^2
(because
both
elements
are
10
\
Q:
Suppose
X
and
Y
are
random
variables
with
E(X)
=
37,
var(X)
=
5,
E(Y)
=
62,
and
\
away
from
the
mean
of
zero).
The
variance
of
distribution
III
is
0,
since
it
has
\
var(Y)
=
12.
What
are
the
expected
value
and
variance
of
the
random
variable
X
+
\
a
single
entry.
So
distribution
III
has
the
smallest
standard
deviation
and
distribution
\
Y?
\
II
has
the
largest.
The
answer
is
(D).
\n\n
Q:
Which
of
the
following
is
a
correct
\
\
statement
about
correlation?
\n
(A)
If
the
slope
of
the
regression
line
is
exactly
\
(A)
E(X
+
Y)
=
99,
var(X
+
Y)
=
8.5
(B)
E(X
+
Y)
=
99,
var(X
+
Y)
=
13
(C)
E(X
+
\
1,
then
the
correlation
is
exactly
1.
(B)
If
the
correlation
is
0,
then
the
slope
\
Y)
=
99,
var(X
+
Y)
=
17
(D)
There
is
insufficient
information
to
answer
this
question.
\
of
the
regression
line
is
undefined.
(C)
Switching
which
variable
is
called
x
\
\
and
which
is
called
y
changes
the
sign
of
the
correlation.
(D)
The
correlation
\
A:
Let'
'
s
think
step
by
step.
While
means
of
sums
of
random
variables
add
(regardless
\
r
is
equal
to
the
slope
of
the
regression
line
when
z-scores
for
the
y-variable
\
of
whether
the
variables
are
independent)
in
order
to
determine
the
variance
of
\
are
plotted
against
z-scores
for
the
x-variable.
\n
A:
Let's
think
step
by
step.
\
a
sum
of
random
variables,
we
need
to
know
not
just
their
individual
variances
but
\
Statement
A
is
false
because
the
slope
of
the
regression
line
being
exactly
1
\
the
covariance
of
the
two
variables,
which
is
not
given
in
this
problem.
The
answer
\
can
occur
even
when
the
two
variables
are
not
perfectly
correlated.
Statement
\
is
(D).
\
B
is
false
because
uncorrelated
variables
regression
lines
can
have
slope
zero.
\
\
Statement
C
is
false
because
correlation
is
symmetric
in
the
two
random
variables.
\
\
The
answer
is
(D)."
Q:
Which
of
the
following
sets
has
the
smallest
standard
deviation?
Which
has
the
"group": "mmlu_flan_cot_fewshot_stem"
largest?
"include": "_mmlu_flan_cot_fewshot_template_yaml"
"task": "mmlu_flan_cot_fewshot_high_school_statistics"
I:
{1,2,3}
II:
{-10,10}
III:
{100}
(A)
I,
II
(B)
II,
III
(C)
III,
I
(D)
III,
II
A:
Let'
'
s
think
step
by
step.
The
variance
of
distribution
I
is
the
expected
squared
deviation
from
its
mean
(which
is
2),
so
the
variance
is
2/3
.
The
variance
of
distribution
II
is
10^2
(because
both
elements
are
10
away
from
the
mean
of
zero).
The
variance
of
distribution
III
is
0,
since
it
has
a
single
entry.
So
distribution
III
has
the
smallest
standard
deviation
and
distribution
II
has
the
largest.
The
answer
is
(D).
Q:
Which
of
the
following
is
a
correct
statement
about
correlation?
(A)
If
the
slope
of
the
regression
line
is
exactly
1,
then
the
correlation
is
exactly
1.
(B)
If
the
correlation
is
0,
then
the
slope
of
the
regression
line
is
undefined.
(C)
Switching
which
variable
is
called
x
and
which
is
called
y
changes
the
sign
of
the
correlation.
(D)
The
correlation
r
is
equal
to
the
slope
of
the
regression
line
when
z-scores
for
the
y-variable
are
plotted
against
z-scores
for
the
x-variable.
A:
Let'
'
s
think
step
by
step.
Statement
A
is
false
because
the
slope
of
the
regression
line
being
exactly
1
can
occur
even
when
the
two
variables
are
not
perfectly
correlated.
Statement
B
is
false
because
uncorrelated
variables
regression
lines
can
have
slope
zero.
Statement
C
is
false
because
correlation
is
symmetric
in
the
two
random
variables.
The
answer
is
(D).'
include: _mmlu_flan_cot_fewshot_template_yaml
task: mmlu_flan_cot_fewshot_high_school_statistics
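The variance reasoning in the statistics few-shot answers can be checked with Python's standard library. This is a minimal sketch under the assumption that the sets I, II and III are treated as whole populations, as the quoted answer does. The variable names are illustrative and do not come from the commit.

```python
import math
import statistics

# Smartwatch plus packaging: the weights are independent, so the means
# add and the variances add.
mean_total = 62 + 456
var_total = 1.0**2 + 6.0**2
sd_total = math.sqrt(var_total)

# Population variances of the three sets from the "smallest / largest
# standard deviation" question.
var_i = statistics.pvariance([1, 2, 3])
var_ii = statistics.pvariance([-10, 10])
var_iii = statistics.pvariance([100])

print(mean_total, round(sd_total, 1), var_i, var_ii, var_iii)
```

The combined mean and standard deviation match answer (C), and the ordering of the three variances matches answer (D): set III has the smallest spread and set II the largest.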
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_us_history.yaml
This diff is collapsed.
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_high_school_world_history.yaml
dataset_name: high_school_world_history
"dataset_name": "high_school_world_history"
description
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
"
description
"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
high
\
\
school
world
history.
\n\n
Q:
This
question
refers
to
the
following
information.
\n\
\
school
world
history.
\n\n
Q:
This
question
refers
to
the
following
information.
\n\
\"
At
least
one
of
the
[world's]
societies
would
have
to
somehow
enormously
increase
\
\"
At
least
one
of
the
[world's]
societies
would
have
to
somehow
enormously
increase
\
\
its
productivity
[in
order
to
achieve
global
hegemony].
That
quantum
jump
would
\
\
its
productivity
[in
order
to
achieve
global
hegemony].
That
quantum
jump
would
\
...
@@ -8,75 +8,75 @@ description: "The following are multiple choice questions (with answers) about h
\
be
accomplished
by
exploiting
the
ecosystems,
mineral
resources,
and
human
assets
\
\
be
accomplished
by
exploiting
the
ecosystems,
mineral
resources,
and
human
assets
\
\
of
whole
continents
outside
the
lands
of
the
society
making
the
jump.
Western
\
\
of
whole
continents
outside
the
lands
of
the
society
making
the
jump.
Western
\
\
Europe
did
just
that
by
means
of
its
brutality
and
guns
and,
more
important,
by
\
\
Europe
did
just
that
by
means
of
its
brutality
and
guns
and,
more
important,
by
\
\
geographical
and
ecological
luck.
\"\n
Copyright
\xA9
2015
Cambridge
University
\
\
geographical
and
ecological
luck.
\"\n
Copyright
©
2015
Cambridge
University
Press.
\n
\
\
Press.
\n
Alfred
Crosby,
historian,
Ecological
Imperialism,
2004
\n
The
\"
quantum
\
Alfred
Crosby,
historian,
Ecological
Imperialism,
2004
\n
The
\"
quantum
jump
\"
mentioned
\
\
jump
\"
mentioned
in
the
passage
most
directly
contributed
to
which
of
the
following
\
\
in
the
passage
most
directly
contributed
to
which
of
the
following
developments
\
\
developments
in
the
period
1450
\u2013
1750
C.E.?
\n
(A)
A
breakdown
in
trade
routes
\
\
in
the
period
1450
–
1750
C.E.?
\n
(A)
A
breakdown
in
trade
routes
through
the
collapse
\
\
through
the
collapse
of
the
established
state
structure
(B)
An
increase
in
the
\
\
of
the
established
state
structure
(B)
An
increase
in
the
population
of
the
world
\
\
population
of
the
world
through
more
plentiful
supplies
of
food
(C)
The
spread
\
\
through
more
plentiful
supplies
of
food
(C)
The
spread
of
Chinese
and
Indian
belief
\
\
of
Chinese
and
Indian
belief
systems
across
the
world
(D)
An
increase
in
social
\
\
systems
across
the
world
(D)
An
increase
in
social
unrest
\n
A:
Let's
think
step
\
\
unrest
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
world
history
\
\
by
step.
We
refer
to
Wikipedia
articles
on
world
history
for
help.
The
\"
quantum
\
\
for
help.
The
\"
quantum
jump
\"
mentioned
in
the
passage
refers
to
the
conquest
\
\
jump
\"
mentioned
in
the
passage
refers
to
the
conquest
of
the
New
World
and
the
\
\
of
the
New
World
and
the
Columbian
Exchange.
Choice
(A)
and
(C)
did
not
happen
\
\
Columbian
Exchange.
Choice
(A)
and
(C)
did
not
happen
in
history.
Choice
(C)
refers
\
\
in
history.
Choice
(C)
refers
to
the
human
assets.
The
answer
is
(B).
\n\n
Q:
This
\
\
to
the
human
assets.
The
answer
is
(B).
\n\n
Q:
This
question
refers
to
the
following
\
\
question
refers
to
the
following
information.
\n\"
The
struggle
against
neo-colonialism
\
\
information.
\n\"
The
struggle
against
neo-colonialism
is
not
aimed
at
excluding
\
\
is
not
aimed
at
excluding
the
capital
of
the
developed
world
from
operating
in
\
\
the
capital
of
the
developed
world
from
operating
in
less
developed
countries.
\
\
less
developed
countries.
It
is
aimed
at
preventing
the
financial
power
of
the
\
\
It
is
aimed
at
preventing
the
financial
power
of
the
developed
countries
being
\
\
developed
countries
being
used
in
such
a
way
as
to
impoverish
the
less
developed.
\n\
\
used
in
such
a
way
as
to
impoverish
the
less
developed.
\n
Non-alignment,
as
practiced
\
Non-alignment,
as
practiced
by
Ghana
and
many
other
countries,
is
based
on
co-operation
\
\
by
Ghana
and
many
other
countries,
is
based
on
co-operation
with
all
States
whether
\
\
with
all
States
whether
they
be
capitalist,
socialist
or
have
a
mixed
economy.
\
\
they
be
capitalist,
socialist
or
have
a
mixed
economy.
Such
a
policy,
therefore,
\
\
Such
a
policy,
therefore,
involves
foreign
investment
from
capitalist
countries,
\
\
involves
foreign
investment
from
capitalist
countries,
but
it
must
be
invested
\
\
but
it
must
be
invested
in
accordance
with
a
national
plan
drawn
up
by
the
government
\
\
in
accordance
with
a
national
plan
drawn
up
by
the
government
of
the
non-aligned
\
\
of
the
non-aligned
State
with
its
own
interests
in
mind.
The
issue
is
not
what
\
\
State
with
its
own
interests
in
mind.
The
issue
is
not
what
return
the
foreign
\
\
return
the
foreign
investor
receives
on
his
investments
\u2026
The
question
is
one
\
\
investor
receives
on
his
investments
…
The
question
is
one
of
power.
A
State
in
\
\
of
power.
A
State
in
the
grip
of
neo-colonialism
is
not
master
of
its
own
destiny.
\"\
\
the
grip
of
neo-colonialism
is
not
master
of
its
own
destiny.
\"\
n
Kwame
Nkrumah,
\
\
n
Kwame
Nkrumah,
Neo-Colonialism,
1965
\n
Which
of
the
following
provides
the
best
\
\
Neo-Colonialism,
1965
\n
Which
of
the
following
provides
the
best
context
for
Nkrumah's
\
\
context
for
Nkrumah's
writings?
\n
(A)
The
Industrial
Revolution
(B)
Decolonization
\
\
writings?
\n
(A)
The
Industrial
Revolution
(B)
Decolonization
(C)
Regional
Free
\
\
(C)
Regional
Free
Trade
Associations
(D)
Autarky
\n
A:
Let's
think
step
by
step.
\
\
Trade
Associations
(D)
Autarky
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
\
We
refer
to
Wikipedia
articles
on
world
history
for
help.
The
passage
expresses
\
\
articles
on
world
history
for
help.
The
passage
expresses
a
point
that
the
successful
\
\
a
point
that
the
successful
fight
against
neo-colonialism
were
in
danger
and
the
\
\
fight
against
neo-colonialism
were
in
danger
and
the
newly
independent
nations
\
\
newly
independent
nations
like
Ghana
may
be
re-colonized
via
financial
power
of
\
\
like
Ghana
may
be
re-colonized
via
financial
power
of
the
developed
countries.
\
\
the
developed
countries.
The
answer
is
(B).
\n\n
Q:
This
question
refers
to
the
\
\
The
answer
is
(B).
\n\n
Q:
This
question
refers
to
the
following
information.
\n\"
\
\
following
information.
\n\"
Indeed,
as
both
the
fatwas
of
distinguished
[scholars]
\
Indeed,
as
both
the
fatwas
of
distinguished
[scholars]
who
base
their
opinion
on
\
\
who
base
their
opinion
on
reason
and
tradition
alike
and
the
consensus
of
the
\
\
reason
and
tradition
alike
and
the
consensus
of
the
Sunni
community
agree
that
\
\
Sunni
community
agree
that
the
ancient
obligation
of
extirpation,
extermination,
\
\
the
ancient
obligation
of
extirpation,
extermination,
and
expulsion
of
evil
innovation
\
\
and
expulsion
of
evil
innovation
must
be
the
aim
of
our
exalted
aspiration,
for
\
\
must
be
the
aim
of
our
exalted
aspiration,
for
\"
Religious
zeal
is
a
victory
for
\
\
\"
Religious
zeal
is
a
victory
for
the
Faith
of
God
the
Beneficent
\"
;
then,
in
\
\
the
Faith
of
God
the
Beneficent
\"
;
then,
in
accordance
with
the
words
of
the
Prophet
\
\
accordance
with
the
words
of
the
Prophet
(Peace
upon
him!)
\"
Whosoever
introduces
\
\
(Peace
upon
him!)
\"
Whosoever
introduces
evil
innovation
into
our
order
must
be
\
\
evil
innovation
into
our
order
must
be
expelled
\"
and
\"
Whosoever
does
aught
against
\
\
expelled
\"
and
\"
Whosoever
does
aught
against
our
order
must
be
expelled,
\"
action
\
\
our
order
must
be
expelled,
\"
action
has
become
necessary
and
exigent
\u2026\"\n
\
\
has
become
necessary
and
exigent…
\"\n
Letter
from
Ottoman
Sultan
Selim
I
to
Safavid
\
Letter
from
Ottoman
Sultan
Selim
I
to
Safavid
Shah
Ismail
I,
1514
\n
The
letter
from
\
\
Shah
Ismail
I,
1514
\n
The
letter
from
Selim
I
is
most
clearly
an
example
of
which
\
\
Selim
I
is
most
clearly
an
example
of
which
of
the
following?
\n
(A)
The
maintenance
\
\
of
the
following?
\n
(A)
The
maintenance
of
military
supremacy
at
all
costs
(B)
\
\
of
military
supremacy
at
all
costs
(B)
Expanding
tensions
between
religious
sects
\
\
Expanding
tensions
between
religious
sects
(C)
Factors
that
brought
about
the
\
\
(C)
Factors
that
brought
about
the
collapse
of
the
Ottoman
Empire
(D)
Peacemaking
\
\
collapse
of
the
Ottoman
Empire
(D)
Peacemaking
efforts
among
the
Islamic
empires
\n
\
\
efforts
among
the
Islamic
empires
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
world
history
for
\
\
articles
on
world
history
for
help.
The
passage
is
an
example
of
expanding
tensions
\
\
help.
The
passage
is
an
example
of
expanding
tensions
between
Selim
and
Ismail.
\
\
between
Selim
and
Ismail.
In
the
passage
the
Selim
references
the
fatwa
and
the
\
\
In
the
passage
the
Selim
references
the
fatwa
and
the
consensus
of
the
Sunni
community
\
\
consensus
of
the
Sunni
community
to
against
whosoever
introduces
evil.
The
answer
\
\
to
against
whosoever
introduces
evil.
The
answer
is
(B).
\n\n
Q:
This
question
refers
\
\
is
(B).
\n\n
Q:
This
question
refers
to
the
following
information.
\n\"
The
real
grievance
\
\
to
the
following
information.
\n\"
The
real
grievance
of
the
worker
is
the
insecurity
\
\
of
the
worker
is
the
insecurity
of
his
existence;
he
is
not
sure
that
he
will
\
\
of
his
existence;
he
is
not
sure
that
he
will
always
have
work,
he
is
not
sure
\
\
always
have
work,
he
is
not
sure
that
he
will
always
be
healthy,
and
he
foresees
\
\
that
he
will
always
be
healthy,
and
he
foresees
that
he
will
one
day
be
old
and
\
\
that
he
will
one
day
be
old
and
unfit
to
work.
If
he
falls
into
poverty,
even
\
\
unfit
to
work.
If
he
falls
into
poverty,
even
if
only
through
a
prolonged
illness,
\
\
if
only
through
a
prolonged
illness,
he
is
then
completely
helpless,
left
\
\
he
is
then
completely
helpless,
left
to
his
own
devices,
and
society
does
\
\
to
his
own
devices,
and
society
does
not
currently
recognize
any
real
obligation
\
\
not
currently
recognize
any
real
obligation
towards
him
beyond
the
usual
help
\
\
towards
him
beyond
the
usual
help
for
the
poor,
even
if
he
has
been
working
all
\
\
for
the
poor,
even
if
he
has
been
working
all
the
time
ever
so
faithfully
and
\
\
the
time
ever
so
faithfully
and
diligently.
The
usual
help
for
the
poor,
however,
\
\
diligently.
The
usual
help
for
the
poor,
however,
leaves
a
lot
to
be
desired,
\
\
leaves
a
lot
to
be
desired,
especially
in
large
cities,
where
it
is
very
much
\
\
especially
in
large
cities,
where
it
is
very
much
worse
than
in
the
country.
\"
\
\
worse
than
in
the
country.
\"\
n
Otto
von
Bismarck,
1884
\n
Otto
von
Bismarck
likely
\
\n
Otto
von
Bismarck,
1884
\n
Otto
von
Bismarck
likely
made
this
speech
in
reaction
\
\
made
this
speech
in
reaction
to
which
of
the
following
issues?
\n
(A)
Social
acceptance
\
\
to
which
of
the
following
issues?
\n
(A)
Social
acceptance
of
child
labor
(B)
Declining
\
\
of
child
labor
(B)
Declining
life
expectancy
in
Germany
(C)
Criticisms
of
German
\
\
life
expectancy
in
Germany
(C)
Criticisms
of
German
trade
tariffs
(D)
Negative
\
\
trade
tariffs
(D)
Negative
effects
attributed
to
industrial
capitalism
\n
A:
Let's
\
\
effects
attributed
to
industrial
capitalism
\n
A:
Let's
think
step
by
step.
We
refer
\
\
think
step
by
step.
We
refer
to
Wikipedia
articles
on
world
history
for
help.
\
\
to
Wikipedia
articles
on
world
history
for
help.
The
passage
talks
about
the
grievance
\
\
The
passage
talks
about
the
grievance
of
the
work
under
the
industrial
capitalism.
\
\
of
the
work
under
the
industrial
capitalism.
The
answer
is
(D).
\n\n
Q:
This
question
\
\
The
answer
is
(D).
\n\n
Q:
This
question
refers
to
the
following
information.
\n
\
\
refers
to
the
following
information.
\n
He
contains
all
works
and
desires
and
all
\
He
contains
all
works
and
desires
and
all
perfumes
and
all
tastes.
He
enfolds
the
\
\
perfumes
and
all
tastes.
He
enfolds
the
whole
universe
and
in
silence
is
loving
\
\
whole
universe
and
in
silence
is
loving
to
all.
This
is
the
Spirit
that
is
in
\
\
to
all.
This
is
the
Spirit
that
is
in
my
heart,
this
is
Brahman.
To
him
I
shall
\
\
my
heart,
this
is
Brahman.
To
him
I
shall
come
when
I
go
beyond
this
life,
and
\
\
come
when
I
go
beyond
this
life,
and
to
him
will
come
he
who
has
faith
and
doubts
\
\
to
him
will
come
he
who
has
faith
and
doubts
not.
\n\u2014
The
Upanishads,
India,
\
\
not.
\n
—The
Upanishads,
India,
c.
1000
BCE
\n
To
which
religion
does
the
speaker
\
\
c.
1000
BCE
\n
To
which
religion
does
the
speaker
most
likely
belong?
\n
(A)
Hinduism
\
\
most
likely
belong?
\n
(A)
Hinduism
(B)
Buddhism
(C)
Shintoism
(D)
Zoroastrianism
\n
\
\
(B)
Buddhism
(C)
Shintoism
(D)
Zoroastrianism
\n
A:
Let's
think
step
by
step.
We
\
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
world
history
for
\
\
refer
to
Wikipedia
articles
on
world
history
for
help.
Brahman
refers
to
the
ultimate
\
\
help.
Brahman
refers
to
the
ultimate
reality
of
all
things
in
the
Hindu
religion.
\
\
reality
of
all
things
in
the
Hindu
religion.
In
contrast,
Buddhism
does
not
have
\
\
In
contrast,
Buddhism
does
not
have
a
concept
of
supreme
God.
The
answer
is
(A)."
\
a
concept
of
supreme
God.
The
answer
is
(A).
"
"
group"
:
"
mmlu_flan_cot_fewshot_humanities
"
include
:
_mmlu_flan_cot_fewshot_template_yaml
"
include
"
:
"
_mmlu_flan_cot_fewshot_template_yaml
"
task
:
mmlu_flan_cot_fewshot_high_school_world_history
"
task
"
:
"
mmlu_flan_cot_fewshot_high_school_world_history
"
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_human_aging.yaml
dataset_name
:
human_aging
"
dataset_name"
:
"
human_aging"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
human
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
human
\
aging.
\
aging.
\n\n
Q:
All
other
things
being
equal,
which
of
the
following
persons
is
more
\
\
likely
to
show
osteoporosis?
\n
(A)
An
older
Hispanic
American
woman
(B)
An
older
\
\
African
American
woman
(C)
An
older
Asian
American
woman
(D)
An
older
Native
American
\
Q:
All
other
things
being
equal,
which
of
the
following
persons
is
more
likely
to
\
woman
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
aging
\
show
osteoporosis?
\
for
help.
Although
osteoporosis
can
occur
at
any
age,
the
risk
is
higher
for
older
\
\
people.
It
is
most
common
in
Asian
and
non-Hispanic
white
women.
The
answer
is
\
(A)
An
older
Hispanic
American
woman
(B)
An
older
African
American
woman
(C)
An
\
(C).
\n\n
Q:
The
finding
that
adults
tend
to
remember
events
from
their
adolescence
\
older
Asian
American
woman
(D)
An
older
Native
American
woman
\
better
than
from
other
periods
in
their
lives
is
referred
to
as
the
\n
(A)
Adolescence
\
\
advantage
(B)
Reminiscence
bump
(C)
Memorial
memorial
(D)
Quadratic
retrieval
\
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
aging
for
\
spike
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
aging
\
help.
Although
osteoporosis
can
occur
at
any
age,
the
risk
is
higher
for
older
people.
\
for
help.
Reminiscence
bump
is
a
phenomenon
that
older
adults
tend
to
recollect
\
It
is
most
common
in
Asian
and
non-Hispanic
white
women.
The
answer
is
(C).
\
events
during
their
young
ages.
People
usually
have
a
period
of
childhood
amnesia
\
\
from
birth
to
around
age
5,
and
a
reminiscence
bump
between
10
and
30.
The
answer
\
\
is
(B).
\n\n
Q:
Which
element
in
tobacco
smoke
is
responsible
for
cancers?
\n
(A)
\
Q:
The
finding
that
adults
tend
to
remember
events
from
their
adolescence
better
\
Nicotine
(B)
Tar
(C)
Carbon
monoxide
(D)
Smoke
particles
\n
A:
Let's
think
step
\
than
from
other
periods
in
their
lives
is
referred
to
as
the
\
by
step.
We
refer
to
Wikipedia
articles
on
human
aging
for
help.
The
benzene,
\
\
acrylamide
and
acrylonitrile
in
tar
interact
with
the
lungs
and
cause
DNA
mutations
\
(A)
Adolescence
advantage
(B)
Reminiscence
bump
(C)
Memorial
memorial
(D)
Quadratic
\
in
cells
of
the
lungs,
and
lead
to
cancer.
The
answer
is
(B).
\n\n
Q:
When
older
\
retrieval
spike
\
adults
move
to
a
new
state
after
retirement,
which
of
the
following
is
the
more
\
\
likely
destination?
\n
(A)
Texas
(B)
California
(C)
Hawaii
(D)
Vermont
\n
A:
Let's
\
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
aging
for
\
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
aging
for
help.
Texas
\
help.
Reminiscence
bump
is
a
phenomenon
that
older
adults
tend
to
recollect
events
\
does
not
have
state
tax,
and
has
low
cost
of
living
compared
with
the
other
three
\
during
their
young
ages.
People
usually
have
a
period
of
childhood
amnesia
from
\
options.
The
answer
is
(A)."
birth
to
around
age
5,
and
a
reminiscence
bump
between
10
and
30.
The
answer
is
"
group"
:
"
mmlu_flan_cot_fewshot_other"
(B).
"
include"
:
"
_mmlu_flan_cot_fewshot_template_yaml"
"
task"
:
"
mmlu_flan_cot_fewshot_human_aging"
Q:
Which
element
in
tobacco
smoke
is
responsible
for
cancers?
(A)
Nicotine
(B)
Tar
(C)
Carbon
monoxide
(D)
Smoke
particles
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
aging
for
help.
The
benzene,
acrylamide
and
acrylonitrile
in
tar
interact
with
the
lungs
and
cause
DNA
mutations
in
cells
of
the
lungs,
and
lead
to
cancer.
The
answer
is
(B).
Q:
When
older
adults
move
to
a
new
state
after
retirement,
which
of
the
following
is
the
more
likely
destination?
(A)
Texas
(B)
California
(C)
Hawaii
(D)
Vermont
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
aging
for
help.
Texas
does
not
have
state
tax,
and
has
low
cost
of
living
compared
with
the
other
three
options.
The
answer
is
(A).'
include
:
_mmlu_flan_cot_fewshot_template_yaml
task
:
mmlu_flan_cot_fewshot_human_aging
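The `description` field in these task files follows a regular pattern: a one-sentence header, then exemplars of the form "Q: ... (A) ... (B) ... (C) ... (D) ... A: Let's think step by step. ... The answer is (X).", separated by blank lines. The sketch below shows how such a string could be assembled. The function names `format_exemplar` and `build_description` are hypothetical and are not taken from lm-evaluation-harness; the exemplar text is copied from the human_aging file above.

```python
def format_exemplar(question, choices, rationale, answer):
    """Render one chain-of-thought exemplar in the Q/A layout used above."""
    letters = "ABCD"
    options = " ".join(
        f"({letters[i]}) {choice}" for i, choice in enumerate(choices)
    )
    return (
        f"Q: {question}\n{options}\n"
        f"A: Let's think step by step. {rationale} The answer is ({answer})."
    )

def build_description(subject, exemplars):
    """Join the header and all exemplars with blank lines between them."""
    header = (
        "The following are multiple choice questions (with answers) "
        f"about {subject}."
    )
    return "\n\n".join([header] + [format_exemplar(**e) for e in exemplars])

# One exemplar taken from the human_aging description in the diff.
exemplars = [
    {
        "question": "Which element in tobacco smoke is responsible for cancers?",
        "choices": ["Nicotine", "Tar", "Carbon monoxide", "Smoke particles"],
        "rationale": (
            "We refer to Wikipedia articles on human aging for help. "
            "The benzene, acrylamide and acrylonitrile in tar interact with "
            "the lungs and cause DNA mutations in cells of the lungs, and "
            "lead to cancer."
        ),
        "answer": "B",
    }
]

description = build_description("human aging", exemplars)
```

Building the string programmatically keeps the exemplar layout identical across subjects, which matters because the quoting style (single-quoted versus double-quoted YAML) changes between the two sides of this diff while the rendered text should not.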
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_human_sexuality.yaml
dataset_name
:
human_sexuality
"
dataset_name"
:
"
human_sexuality"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
human
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
human
\
sexuality.
\
sexuality.
\n\n
Q:
The
most
common
disorder
among
men
who
seek
sexual
therapy
is:
\n\
(A)
premature
ejaculation
(B)
inhibited
ejaculation
(C)
erectile
disorder
(D)
ejaculatory
\
\
disorder
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
\
Q:
The
most
common
disorder
among
men
who
seek
sexual
therapy
is:
\
sexuality
for
help.
The
most
common
type
of
arousal
disorder
for
men
is
erectile
\
\
dysfunction,
meaning
that
a
person
is
not
able
to
get
any
physical
satisfaction
\
(A)
premature
ejaculation
(B)
inhibited
ejaculation
(C)
erectile
disorder
(D)
ejaculatory
\
from
sexual
activity
although
he
may
be
interested
in
it.
The
answer
is
(C).
\n\
disorder
\n
Q:
A
woman
who
knows
she
has
active
herpes
and
untreated
syphilis
but
continues
\
\
to
have
sex
without
informing
her
partners
of
her
condition
has,
in
psychoanalytic
\
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
\
terms:
\n
(A)
a
strong
ego
(B)
a
weak
superego
(C)
a
weak
id
(D)
a
strong
superego
\n\
for
help.
The
most
common
type
of
arousal
disorder
for
men
is
erectile
dysfunction,
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
for
\
meaning
that
a
person
is
not
able
to
get
any
physical
satisfaction
from
sexual
activity
\
help.
A
person
with
weak
superego
tends
to
be
delinquent,
criminal
or
have
antisocial
\
although
he
may
be
interested
in
it.
The
answer
is
(C).
\
personality.
The
action
of
the
woman
who
knows
she
has
active
venereal
disease
\
\
but
still
have
sex
with
her
partners
indicate
she
may
has
antisocial
personality.
\
\
The
answer
is
(B).
\n\n
Q:
The
nature
of
homosexual
activities
that
occur
during
\
Q:
A
woman
who
knows
she
has
active
herpes
and
untreated
syphilis
but
continues
\
preadolescence
include
all
but
which
of
the
following?
\n
(A)
sexual
intercourse
\
to
have
sex
without
informing
her
partners
of
her
condition
has,
in
psychoanalytic
\
(B)
circle
jerks
(C)
exhibitionism
(D)
touching
each
other's
genitals
\n
A:
Let's
\
terms:
\
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
for
help.
\
\
Sexual
intercourse
is
prohibited
during
preadolescence.
The
answer
is
(A).
\n\n\
(A)
a
strong
ego
(B)
a
weak
superego
(C)
a
weak
id
(D)
a
strong
superego
Q:
Women's
ability
to
have
multiple
orgasms
is
primarily
due
to:
\n
(A)
the
fact
that
\
\
they
do
not
have
a
refractory
period.
(B)
the
response
of
the
inner
layers
of
\
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
\
the
vagina.
(C)
having
alternating
orgasms
in
different
locations.
(D)
the
G-Spot.
\n\
for
help.
A
person
with
weak
superego
tends
to
be
delinquent,
criminal
or
have
antisocial
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
for
\
personality.
The
action
of
the
woman
who
knows
she
has
active
venereal
disease
but
\
help.
The
refractory
period
is
the
time
when
a
person
is
not
able
to
be
erect
\
still
have
sex
with
her
partners
indicate
she
may
has
antisocial
personality.
The
\
or
is
not
interested
in
sex.
The
answer
is
(A).
\n\n
Q:
Morning
sickness
is
typically
\
answer
is
(B).
\
a
problem:
\n
(A)
during
the
first
trimester
(B)
during
the
second
trimester
(C)
\
\
during
the
third
trimester
(D)
all
through
the
pregnancy
\n
A:
Let's
think
step
\
\
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
for
help.
Morning
sickness
\
Q:
The
nature
of
homosexual
activities
that
occur
during
preadolescence
include
\
usually
begins
by
nine
weeks
after
conception,
corresponding
to
the
first
trimester.
\
all
but
which
of
the
following?
\
The
answer
is
(A)."
"
group"
:
"
mmlu_flan_cot_fewshot_social_sciences"
(A)
sexual
intercourse
(B)
circle
jerks
(C)
exhibitionism
(D)
touching
each
other''s
"
include"
:
"
_mmlu_flan_cot_fewshot_template_yaml"
genitals
"
task"
:
"
mmlu_flan_cot_fewshot_human_sexuality"
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
for
help.
Sexual
intercourse
is
prohibited
during
preadolescence.
The
answer
is
(A).
Q:
Women''s
ability
to
have
multiple
orgasms
is
primarily
due
to:
(A)
the
fact
that
they
do
not
have
a
refractory
period.
(B)
the
response
of
the
inner
layers
of
the
vagina.
(C)
having
alternating
orgasms
in
different
locations.
(D)
the
G-Spot.
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
for
help.
The
refractory
period
is
the
time
when
a
person
is
not
able
to
be
erect
or
is
not
interested
in
sex.
The
answer
is
(A).
Q:
Morning
sickness
is
typically
a
problem:
(A)
during
the
first
trimester
(B)
during
the
second
trimester
(C)
during
the
third
trimester
(D)
all
through
the
pregnancy
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
human
sexuality
for
help.
Morning
sickness
usually
begins
by
nine
weeks
after
conception,
corresponding
to
the
first
trimester.
The
answer
is
(A).'
include
:
_mmlu_flan_cot_fewshot_template_yaml
task
:
mmlu_flan_cot_fewshot_human_sexuality
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_international_law.yaml
dataset_name
:
international_law
"
dataset_name"
:
"
international_law"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
international
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
international
\
law.
\
law.
\n\n
Q:
How
the
consent
to
be
bound
of
a
State
may
be
expressed?
\n
(A)
The
consent
\
\
of
a
State
to
be
bound
is
expressed
only
by
ratification
(B)
The
consent
of
a
\
\
state
to
be
bound
by
a
treaty
may
be
expressed
by
signature,
ratification,
acceptance,
\
Q:
How
the
consent
to
be
bound
of
a
State
may
be
expressed?
\
approval
or
accession
(C)
The
consent
of
a
State
to
be
bound
is
expressed
by
signature
\
\
(D)
The
consent
of
a
State
to
be
bound
is
expressed
by
whatever
means
they
choose
\n\
(A)
The
consent
of
a
State
to
be
bound
is
expressed
only
by
ratification
(B)
The
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
international
law
\
consent
of
a
state
to
be
bound
by
a
treaty
may
be
expressed
by
signature,
ratification,
\
for
help.
Article
11
of
Vienna
Convention
on
the
Law
of
Treaties
signed
in
1969
\
acceptance,
approval
or
accession
(C)
The
consent
of
a
State
to
be
bound
is
expressed
\
states
that
\"
the
consent
of
a
State
to
be
bound
by
a
treaty
may
be
expressed
\
by
signature
(D)
The
consent
of
a
State
to
be
bound
is
expressed
by
whatever
means
\
by
signature,
exchange
of
instruments
constituting
a
treaty,
ratification,
acceptance,
\
they
choose
\
approval
or
accession,
or
by
any
other
means
if
so
agreed.
\"
(B)
is
the
most
precise
\
\
and
accurate
answer.
The
answer
is
(B).
\n\n
Q:
What
is
the
judge
ad
hoc?
\n
(A)
If
\
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
international
law
\
a
party
to
a
contentious
case
before
the
ICJ
does
not
have
a
national
sitting
\
for
help.
Article
11
of
Vienna
Convention
on
the
Law
of
Treaties
signed
in
1969
\
as
judge,
it
is
entitled
to
nominate
someone
as
a
judge
solely
for
that
case,
\
states
that
"the
consent
of
a
State
to
be
bound
by
a
treaty
may
be
expressed
by
\
with
the
title
of
judge
ad
hoc
(B)
Judge
ad
hoc
is
the
member
of
the
bench
of
\
signature,
exchange
of
instruments
constituting
a
treaty,
ratification,
acceptance,
\
the
ICJ
with
a
casting
vote
(C)
Judge
ad
hoc
is
a
surrogate
judge,
in
case
a
judge
\
approval
or
accession,
or
by
any
other
means
if
so
agreed."
(B)
is
the
most
precise
\
is
disqualified
or
passes
away
(D)
Judge
ad
hoc
is
the
judge
that
each
party
will
\
and
accurate
answer.
The
answer
is
(B).
\
always
nominate
in
every
contentious
case
\n
A:
Let's
think
step
by
step.
We
refer
\
\
to
Wikipedia
articles
on
international
law
for
help.
As
\"
ad
hoc
\"
implies,
a
\
\
judge
ad
hoc
is
appointed
only
for
a
specific
case
or
period,
when
a
party
to
\
Q:
What
is
the
judge
ad
hoc?
\
a
contentious
case
before
the
International
Court
of
Justice
does
not
have
a
regular
\
\
national
sitting
as
judge.
The
answer
is
(A).
\n\n
Q:
When
'consent'
can
serve
as
\
(A)
If
a
party
to
a
contentious
case
before
the
ICJ
does
not
have
a
national
sitting
\
a
circumstance
precluding
the
wrongfulness
of
a
State
conduct?
\n
(A)
Consent
can
\
as
judge,
it
is
entitled
to
nominate
someone
as
a
judge
solely
for
that
case,
with
\
serve
as
a
circumstance
precluding
the
wrongfulness
whenever
it
is
given
(B)
Consent
\
the
title
of
judge
ad
hoc
(B)
Judge
ad
hoc
is
the
member
of
the
bench
of
the
ICJ
\
can
never
serve
as
a
circumstance
precluding
wrongfulness
(C)
Consent
can
serve
\
with
a
casting
vote
(C)
Judge
ad
hoc
is
a
surrogate
judge,
in
case
a
judge
is
disqualified
\
as
a
circumstance
precluding
wrongfulness,
provided
the
consent
is
valid
and
to
\
or
passes
away
(D)
Judge
ad
hoc
is
the
judge
that
each
party
will
always
nominate
\
the
extent
that
the
conduct
remains
within
the
limits
of
the
consent
given
(D)
\
in
every
contentious
case
\
Consent
can
always
serve
as
a
circumstance
precluding
wrongfulness,
no
matter
\
\
which
organ
of
the
State
gives
it
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
international
law
\
articles
on
international
law
for
help.
Valid
consent
can
serve
as
a
circumstance
\
for
help.
As
"ad
hoc"
implies,
a
judge
ad
hoc
is
appointed
only
for
a
specific
case
\
precluding
the
wrongfulness
of
a
State
conduct
if
the
conduct
remains
within
the
\
or
period,
when
a
party
to
a
contentious
case
before
the
International
Court
of
\
limits
of
that
consent,
according
to
Chapter
V
of
the
Responsibility
of
States
\
Justice
does
not
have
a
regular
national
sitting
as
judge.
The
answer
is
(A).
\
for
Internationally
Wrongful
Acts,
2001,
United
Nations.
The
answer
is
(C).
\n\n\
Q:
Would
a
reservation
to
the
definition
of
torture
in
the
ICCPR
be
acceptable
in
\
\
contemporary
practice?
\n
(A)
This
is
an
acceptable
reservation
if
the
reserving
\
Q:
When
''consent''
can
serve
as
a
circumstance
precluding
the
wrongfulness
of
a
\
country's
legislation
employs
a
different
definition
(B)
This
is
an
unacceptable
\
State
conduct?
\
reservation
because
it
contravenes
the
object
and
purpose
of
the
ICCPR
(C)
This
\
\
is
an
unacceptable
reservation
because
the
definition
of
torture
in
the
ICCPR
\
(A)
Consent
can
serve
as
a
circumstance
precluding
the
wrongfulness
whenever
it
\
is
consistent
with
customary
international
law
(D)
This
is
an
acceptable
reservation
\
is
given
(B)
Consent
can
never
serve
as
a
circumstance
precluding
wrongfulness
(C)
\
because
under
general
international
law
States
have
the
right
to
enter
reservations
\
Consent
can
serve
as
a
circumstance
precluding
wrongfulness,
provided
the
consent
\
to
treaties
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
international
\
is
valid
and
to
the
extent
that
the
conduct
remains
within
the
limits
of
the
consent
\
law
for
help.
For
it
contravenes
the
object
and
purpose
of
the
ICCPR,
this
is
\
given
(D)
Consent
can
always
serve
as
a
circumstance
precluding
wrongfulness,
no
\
an
unacceptable
reservation
in
contemporary
practice.
The
answer
is
(B).
\n\n
Q:
\
matter
which
organ
of
the
State
gives
it
\
What
types
of
force
does
Article
2(4)
of
the
UN
Charter
prohibit?
\n
(A)
Article
\
\
2(4)
encompasses
only
armed
force
(B)
Article
2(4)
encompasses
all
types
of
force,
\
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
international
law
\
including
sanctions
(C)
Article
2(4)
encompasses
all
interference
in
the
domestic
\
for
help.
Valid
consent
can
serve
as
a
circumstance
precluding
the
wrongfulness
\
affairs
of
States
(D)
Article
2(4)
encompasses
force
directed
only
against
a
State's
\
of
a
State
conduct
if
the
conduct
remains
within
the
limits
of
that
consent,
according
\
territorial
integrity
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
\
to
Chapter
V
of
the
Responsibility
of
States
for
Internationally
Wrongful
Acts,
\
on
international
law
for
help.
Article
2(4)
of
the
UN
Charter
prohibits
states
\
2001,
United
Nations.
The
answer
is
(C).
\
from
using
armed
forces
in
their
international
relations.
The
answer
is
(A)."
"
group"
:
"
mmlu_flan_cot_fewshot_humanities"
"
include"
:
"
_mmlu_flan_cot_fewshot_template_yaml"
Q:
Would
a
reservation
to
the
definition
of
torture
in
the
ICCPR
be
acceptable
in
"
task"
:
"
mmlu_flan_cot_fewshot_international_law"
contemporary
practice?
(A)
This
is
an
acceptable
reservation
if
the
reserving
country''s
legislation
employs
a
different
definition
(B)
This
is
an
unacceptable
reservation
because
it
contravenes
the
object
and
purpose
of
the
ICCPR
(C)
This
is
an
unacceptable
reservation
because
the
definition
of
torture
in
the
ICCPR
is
consistent
with
customary
international
law
(D)
This
is
an
acceptable
reservation
because
under
general
international
law
States
have
the
right
to
enter
reservations
to
treaties
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
international
law
for
help.
For
it
contravenes
the
object
and
purpose
of
the
ICCPR,
this
is
an
unacceptable
reservation
in
contemporary
practice.
The
answer
is
(B).
Q:
What
types
of
force
does
Article
2(4)
of
the
UN
Charter
prohibit?
(A)
Article
2(4)
encompasses
only
armed
force
(B)
Article
2(4)
encompasses
all
types
of
force,
including
sanctions
(C)
Article
2(4)
encompasses
all
interference
in
the
domestic
affairs
of
States
(D)
Article
2(4)
encompasses
force
directed
only
against
a
State''s
territorial
integrity
A:
Let''s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
international
law
for
help.
Article
2(4)
of
the
UN
Charter
prohibits
states
from
using
armed
forces
in
their
international
relations.
The
answer
is
(A).'
include
:
_mmlu_flan_cot_fewshot_template_yaml
task
:
mmlu_flan_cot_fewshot_international_law
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_jurisprudence.yaml
dataset_name
:
jurisprudence
"
dataset_name"
:
"
jurisprudence"
description
:
'
The
following
are
multiple
choice
questions
(with
answers)
about
jurisprudence.
"
description"
:
"
The
following
are
multiple
choice
questions
(with
answers)
about
jurisprudence.
\n\
\n
Q:
Iverson
Jewelers
wrote
a
letter
to
Miller,
'We
have
received
an
exceptionally
\
\
fine
self
winding
Rolox
watch
which
we
will
sell
to
you
at
a
very
favorable
price.'
\n\
Q:
Iverson
Jewelers
wrote
a
letter
to
Miller,
''We
have
received
an
exceptionally
(A)
The
letter
is
an
offer
to
sell
(B)
A
valid
offer
cannot
be
made
by
letter.
(C)
\
fine
self
winding
Rolox
watch
which
we
will
sell
to
you
at
a
very
favorable
price.''
\
The
letter
contains
a
valid
offer
which
will
terminate
within
a
reasonable
time.
\
\
(D)
The
letter
lacks
one
of
the
essential
elements
of
an
offer.
\n
A:
Let's
think
\
(A)
The
letter
is
an
offer
to
sell
(B)
A
valid
offer
cannot
be
made
by
letter.
(C)
\
step
by
step.
We
refer
to
Wikipedia
articles
on
jurisprudence
for
help.
An
offer
\
The
letter
contains
a
valid
offer
which
will
terminate
within
a
reasonable
time.
\
shows
the
intent
to
enter
into
a
mutually-beneficial
contract
with
specific
terms.
\
(D)
The
letter
lacks
one
of
the
essential
elements
of
an
offer.
\
An
offer
can
be
made
by
a
letter.
While
this
letter
indicates
the
willingness
\
\
to
sell,
the
lack
of
specific
terms,
such
as
transaction
price
and
offer
expiration
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
jurisprudence
for
\
date,
makes
it
an
incomplete
offer.
The
answer
is
(D).
\n\n
Q:
Functions
of
the
\
help.
An
offer
shows
the
intent
to
enter
into
a
mutually-beneficial
contract
with
\
law
include
all
but
which
of
the
following?
\n
(A)
maximizing
individual
freedom
\
specific
terms.
An
offer
can
be
made
by
a
letter.
While
this
letter
indicates
the
\
(B)
providing
a
basis
for
compromise
(C)
keeping
the
peace
(D)
promoting
the
principles
\
willingness
to
sell,
the
lack
of
specific
terms,
such
as
transaction
price
and
offer
\
of
the
free
enterprise
system
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
expiration
date,
makes
it
an
incomplete
offer.
The
answer
is
(D).
\
articles
on
jurisprudence
for
help.
Laws
are
fundamentally
about
helping
resolve
\
\
disputes
between
individuals,
and
therefore
essential
for
maximizing
individual
\
\
freedom,
providing
a
basis
for
compromise,
and
keeping
the
peace.
The
answer
is
\
Q:
Functions
of
the
law
include
all
but
which
of
the
following?
\
(D).
\n\n
Q:
The
________
School
of
jurisprudence
postulates
that
the
law
is
based
\
\
on
what
is
\"
correct.
\"\n
(A)
Natural
Law
(B)
Analytical
(C)
Historical
(D)
Sociological
\n\
(A)
maximizing
individual
freedom
(B)
providing
a
basis
for
compromise
(C)
keeping
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
jurisprudence
for
\
the
peace
(D)
promoting
the
principles
of
the
free
enterprise
system
\
help.
Natural
Law
School
of
jurisprudence
focuses
on
the
laws
of
nature,
and
states
\
\
that
the
law
should
be
based
on
ethics,
morals,
and
what
is
\"
correct
\"
.
Analytical
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
jurisprudence
for
\
deals
with
the
law
as
it
already
exists,
Historical
postulates
that
the
law
was
\
help.
Laws
are
fundamentally
about
helping
resolve
disputes
between
individuals,
\
found
and
not
made,
and
Sociological
studies
how
the
law
and
society
impact
each
\
and
therefore
essential
for
maximizing
individual
freedom,
providing
a
basis
for
\
other.
The
answer
is
(A).
\n\n
Q:
Which
word
best
summarizes
Weber's
explanation
\
compromise,
and
keeping
the
peace.
The
answer
is
(D).
\
of
the
development
of
formally
rational
law?
\n
(A)
Authority.
(B)
Charisma.
(C)
\
\
Co-operation.
(D)
Capitalism.
\n
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
\
\
articles
on
jurisprudence
for
help.
Weber
explained
the
development
of
formal
\
Q:
The
________
School
of
jurisprudence
postulates
that
the
law
is
based
on
what
\
rationality
in
laws
as
how
the
modern
society
moved
from
tradition
to
rationality,
\
is
"correct."
\
where
people
decide
actions
based
less
on
how
they
were
culturally
done
and
more
\
\
on
expected
utilities.
How
rational
individuals
optimize
efficiency
of
accomplishing
\
(A)
Natural
Law
(B)
Analytical
(C)
Historical
(D)
Sociological
\
tasks
for
higher
rewards
is
a
core
principle
of
Capitalism.
The
answer
is
(D).
\n\
\n
Q:
Which
position
does
Rawls
claim
is
the
least
likely
to
be
adopted
by
the
POP
\
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
jurisprudence
for
\
(people
in
the
original
position)?
\n
(A)
The
POP
would
choose
equality
above
liberty.
\
help.
Natural
Law
School
of
jurisprudence
focuses
on
the
laws
of
nature,
and
states
\
(B)
The
POP
would
opt
for
the
'maximin'
strategy.
(C)
The
POP
would
opt
for
the
\
that
the
law
should
be
based
on
ethics,
morals,
and
what
is
"correct".
Analytical
\
'difference
principle'.
(D)
The
POP
would
reject
the
'system
of
natural
liberty.'
\n\
deals
with
the
law
as
it
already
exists,
Historical
postulates
that
the
law
was
A:
Let's
think
step
by
step.
We
refer
to
Wikipedia
articles
on
jurisprudence
for
\
found
and
not
made,
and
Sociological
studies
how
the
law
and
society
impact
each
\
help.
The
POP
would
opt
for
the
'maximin'
strategy,
opt
for
the
'difference
principle',
\
other.
The
answer
is
(A).
\
and
reject
the
'system
of
natural
liberty',
but
the
POP
would
not
choose
equality
\
\
above
liberty,
since
the
POP
assume
both
equal
and
free
citizens.
The
answer
is
\
\
(A)."
Q:
Which
word
best
summarizes
Weber'
'
s
explanation
of
the
development
of
formally
"
group"
:
"
mmlu_flan_cot_fewshot_humanities"
rational
law?
"
include"
:
"
_mmlu_flan_cot_fewshot_template_yaml"
"
task"
:
"
mmlu_flan_cot_fewshot_jurisprudence"
(A)
Authority.
(B)
Charisma.
(C)
Co-operation.
(D)
Capitalism.
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
jurisprudence
for
help.
Weber
explained
the
development
of
formal
rationality
in
laws
as
how
the
modern
society
moved
from
tradition
to
rationality,
where
people
decide
actions
based
less
on
how
they
were
culturally
done
and
more
on
expected
utilities.
How
rational
individuals
optimize
efficiency
of
accomplishing
tasks
for
higher
rewards
is
a
core
principle
of
Capitalism.
The
answer
is
(D).
Q:
Which
position
does
Rawls
claim
is
the
least
likely
to
be
adopted
by
the
POP
(people
in
the
original
position)?
(A)
The
POP
would
choose
equality
above
liberty.
(B)
The
POP
would
opt
for
the
'
'
maximin'
'
strategy.
(C)
The
POP
would
opt
for
the
'
'
difference
principle'
'
.
(D)
The
POP
would
reject
the
'
'
system
of
natural
liberty.'
'
A:
Let'
'
s
think
step
by
step.
We
refer
to
Wikipedia
articles
on
jurisprudence
for
help.
The
POP
would
opt
for
the
'
'
maximin'
'
strategy,
opt
for
the
'
'
difference
principle'
'
,
and
reject
the
'
'
system
of
natural
liberty'
'
,
but
the
POP
would
not
choose
equality
above
liberty,
since
the
POP
assume
both
equal
and
free
citizens.
The
answer
is
(A).'
include
:
_mmlu_flan_cot_fewshot_template_yaml
task
:
mmlu_flan_cot_fewshot_jurisprudence
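Every few-shot example in the `description` string ends its reasoning with the fixed phrase "The answer is (X).", so a model completion that follows the same pattern can be scored by pulling out that letter. The snippet below is a minimal sketch of that idea, not the harness's own filter: `ANSWER_RE` and `extract_answer` are hypothetical names introduced here for illustration, and the regex assumes answers are always one of A to D in parentheses.

```python
import re
from typing import Optional

# Hypothetical helper for illustration; not part of lm-evaluation-harness.
# Assumes the completion uses the same "The answer is (X)" phrasing as the
# few-shot examples in the description string.
ANSWER_RE = re.compile(r"The answer is \(([A-D])\)")


def extract_answer(completion: str) -> Optional[str]:
    """Return the letter from the last 'The answer is (X)' match, or None."""
    matches = ANSWER_RE.findall(completion)
    # Take the last match so restated or revised answers win over earlier ones.
    return matches[-1] if matches else None


example = (
    "Let's think step by step. We refer to Wikipedia articles on "
    "jurisprudence for help. An offer can be made by a letter. "
    "The answer is (D)."
)
print(extract_answer(example))
```

Taking the last match rather than the first is a design choice: a chain-of-thought completion may mention a candidate answer mid-reasoning before settling on a final one.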