gaoqiong / lm-evaluation-harness
Commit cd8642e7, authored Jul 26, 2024 by Yu Shi Jie
mmlu-pro: fixed yaml formatting
Parent: cd0983b8
Changes: 40
Showing 20 changed files with 31 additions and 31 deletions
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_business.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_computer_science.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_economics.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_engineering.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_health.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_history.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_law.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_math.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_miscellaneous.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_philosophy.yaml +2 -2
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_physics.yaml +2 -2
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_biology.yaml +1 -1
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_business.yaml +1 -1
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_chemistry.yaml +1 -1
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_computer_science.yaml +1 -1
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_economics.yaml +1 -1
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_engineering.yaml +1 -1
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_health.yaml +1 -1
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_history.yaml +1 -1
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_law.yaml +1 -1
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_business.yaml
 "dataset_name": "business"
 "description": "The following are questions (with answers) about business.\n\
   \n"
-"group": "mmlu_continuation_other"
+"tag": "mmlu_pro_continuation_other"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_business"
+"task": "mmlu_pro_continuation_business"
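The two line-level changes in this file (and in the other continuation configs) could be applied mechanically. The following is a minimal sketch, assuming the files keep the quoted `"key": "value"` layout shown in the diff; `migrate_continuation_yaml` is a hypothetical helper written for illustration, not part of lm-evaluation-harness, and it edits text rather than parsing YAML.

```python
import re

# Hypothetical helper (not an lm-evaluation-harness API): applies the two
# line-level changes from the diff to the text of one continuation config.
OLD_PREFIX = "mmlu_continuation_"
NEW_PREFIX = "mmlu_pro_continuation_"


def migrate_continuation_yaml(text: str) -> str:
    # "group": "mmlu_continuation_X"  ->  "tag": "mmlu_pro_continuation_X"
    text = re.sub(
        r'^"group": "' + re.escape(OLD_PREFIX) + r'(\w+)"$',
        r'"tag": "' + NEW_PREFIX + r'\1"',
        text,
        flags=re.MULTILINE,
    )
    # "task": "mmlu_continuation_X"  ->  "task": "mmlu_pro_continuation_X"
    text = re.sub(
        r'^"task": "' + re.escape(OLD_PREFIX) + r'(\w+)"$',
        r'"task": "' + NEW_PREFIX + r'\1"',
        text,
        flags=re.MULTILINE,
    )
    return text


# Pre-commit content of the business config, as shown on the removed side.
old = '''"dataset_name": "business"
"description": "The following are questions (with answers) about business.\\n\\
  \\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_business"
'''
new = migrate_continuation_yaml(old)
print(new)
```

Because both patterns require the old prefix, running the helper on an already migrated file leaves it unchanged.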
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_computer_science.yaml
 "dataset_name": "computer_science"
 "description": "The following are questions (with answers) about computer_science.\n\
   \n"
-"group": "mmlu_continuation_stem"
+"tag": "mmlu_pro_continuation_stem"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_computer_science"
+"task": "mmlu_pro_continuation_computer_science"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_economics.yaml
 "dataset_name": "economics"
 "description": "The following are questions (with answers) about economics.\n\
   \n"
-"group": "mmlu_continuation_social_sciences"
+"tag": "mmlu_pro_continuation_social_sciences"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_economics"
+"task": "mmlu_pro_continuation_economics"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_engineering.yaml
 "dataset_name": "engineering"
 "description": "The following are questions (with answers) about engineering.\n\
   \n"
-"group": "mmlu_continuation_stem"
+"tag": "mmlu_pro_continuation_stem"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_engineering"
+"task": "mmlu_pro_continuation_engineering"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_health.yaml
 "dataset_name": "health"
 "description": "The following are questions (with answers) about health.\n\
   \n"
-"group": "mmlu_continuation_other"
+"tag": "mmlu_pro_continuation_other"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_health"
+"task": "mmlu_pro_continuation_health"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_history.yaml
 "dataset_name": "history"
 "description": "The following are questions (with answers) about history.\n\
   \n"
-"group": "mmlu_continuation_humanities"
+"tag": "mmlu_pro_continuation_humanities"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_history"
+"task": "mmlu_pro_continuation_history"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_law.yaml
 "dataset_name": "law"
 "description": "The following are questions (with answers) about law.\n\
   \n"
-"group": "mmlu_continuation_humanities"
+"tag": "mmlu_pro_continuation_humanities"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_law"
+"task": "mmlu_pro_continuation_law"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_math.yaml
 "dataset_name": "math"
 "description": "The following are questions (with answers) about math.\n\
   \n"
-"group": "mmlu_continuation_stem"
+"tag": "mmlu_pro_continuation_stem"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_math"
+"task": "mmlu_pro_continuation_math"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_miscellaneous.yaml
 "dataset_name": "other"
 "description": "The following are questions (with answers) about other.\n\
   \n"
-"group": "mmlu_continuation_other"
+"tag": "mmlu_pro_continuation_other"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_miscellaneous"
+"task": "mmlu_pro_continuation_miscellaneous"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_philosophy.yaml
 "dataset_name": "philosophy"
 "description": "The following are questions (with answers) about philosophy.\n\
   \n"
-"group": "mmlu_continuation_humanities"
+"tag": "mmlu_pro_continuation_humanities"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_philosophy"
+"task": "mmlu_pro_continuation_philosophy"
lm_eval/tasks/mmlu_pro/continuation/mmlu_pro_physics.yaml
 "dataset_name": "physics"
 "description": "The following are questions (with answers) about physics.\n\
   \n"
-"group": "mmlu_continuation_stem"
+"tag": "mmlu_pro_continuation_stem"
 "include": "_continuation_template_yaml"
-"task": "mmlu_continuation_physics"
+"task": "mmlu_pro_continuation_physics"
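All eleven continuation configs above receive the same pair of changes, so a small check can confirm that none was missed. This is a sketch under stated assumptions: `find_naming_problems` is a hypothetical function, the expected prefix `mmlu_pro_continuation_` is read off the added lines in the diffs, and the check inspects raw text rather than loading the files through the harness.

```python
import re

EXPECTED_PREFIX = "mmlu_pro_continuation_"  # taken from the added lines above


def find_naming_problems(filename: str, text: str) -> list[str]:
    """Report leftovers of the old naming scheme in one config (hypothetical check)."""
    problems = []
    # The commit replaces the top-level "group" key with "tag".
    if re.search(r'^"group":', text, flags=re.MULTILINE):
        problems.append(f'{filename}: still uses "group" instead of "tag"')
    # Both the tag and the task name should carry the mmlu_pro_ prefix.
    for key in ("tag", "task"):
        match = re.search(rf'^"{key}": "([^"]*)"$', text, flags=re.MULTILINE)
        if match and not match.group(1).startswith(EXPECTED_PREFIX):
            problems.append(f"{filename}: {key} {match.group(1)!r} lacks the expected prefix")
    return problems


before = '''"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_physics"
'''
after = '''"tag": "mmlu_pro_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_pro_continuation_physics"
'''
for problem in find_naming_problems("mmlu_pro_physics.yaml", before):
    print(problem)
print(find_naming_problems("mmlu_pro_physics.yaml", after))
```

The pre-commit text yields two reports (the `"group"` key and the unprefixed task name), and the post-commit text yields none.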
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_biology.yaml
@@ -13,6 +13,6 @@ fewshot_config:
     target: "Let's think step by step. The introduction of foreign DNA or RNA into bacteria or eukaryotic cells is a common technique in molecular biology and scientific research. There are multiple ways foreign DNA can be introduced into cells including transformation, transduction, conjugation, and transfection. In contrast, (A) is not a way to form DNA: during translation the ribosomes synthesize proteins from RNA. The answer is (A)."
   - question: "Which of the following is not known to be involved in the control of cell division? (A) Microtubules (B) Checkpoints (C) DNA polymerase (D) Centrosomes (E) Cyclins (F) Mitochondria (G) Protein kinases (H) Fibroblast cells (I) N/A (J) N/A"
     target: "Let's think step by step. Normal cells move through the cell cycle in a regulated way. At the checkpoint stage, they use information about their own internal state and cues from the environment around them to decide whether to proceed with cell division. Cues like these act by changing the activity of core cell cycle regulators inside the cell. The most common regulators are cyclins and cyclin-dependent kinases. Fibroblast cells do not play any role in cell division. The answer is (H)."
-group: mmlu_pro_flan_cot_fewshot_stem
+tag: mmlu_pro_flan_cot_fewshot_stem
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_biology
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_business.yaml
@@ -18,6 +18,6 @@ fewshot_config:
   - question: "In an organization, the group of people tasked with buying decisions is referred to as the _______________.
       (A) Procurement centre. (B) Chief executive unit. (C) Resources allocation group. (D) Marketing department. (E) Purchasing department. (F) Supply chain management team. (G) Outsourcing unit. (H) Decision-making unit. (I) Operations unit. (J) Financial management team."
     target: "Let's think step by step. We refer to Wikipedia articles on marketing for help. In an organization, the group of the people tasked with buying decision is referred to as the decision-making unit. The answer is (H)."
-group: mmlu_pro_flan_cot_fewshot_other
+tag: mmlu_pro_flan_cot_fewshot_other
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_business
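In the flan_cot_fewshot files the only change is the top-level key `group` becoming `tag`, while the word "group" also occurs inside the few-shot question text of this file. A minimal sketch of a rename that touches only the key, assuming the key starts in column 0 as in the diff; `rename_top_level_group` is a hypothetical helper and the indentation of the sample question line is assumed.

```python
import re


def rename_top_level_group(text: str) -> str:
    """Rename a top-level `group:` key to `tag:` (hypothetical helper)."""
    # Anchoring at the start of a line leaves the word "group" inside the
    # indented few-shot questions untouched.
    return re.sub(r"^group:", "tag:", text, flags=re.MULTILINE)


# Abbreviated stand-in for the business file; the indentation is assumed.
sample = (
    '  - question: "In an organization, the group of people tasked with '
    'buying decisions is referred to as the _______________."\n'
    "group: mmlu_pro_flan_cot_fewshot_other\n"
    "include: _mmlu_pro_flan_cot_fewshot_template_yaml\n"
    "task: mmlu_pro_flan_cot_fewshot_business\n"
)
migrated = rename_top_level_group(sample)
print(migrated)
```

A plain search-and-replace of "group" would also rewrite the question text, which is why the pattern is anchored to the start of a line.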
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_chemistry.yaml
@@ -18,6 +18,6 @@ fewshot_config:
   - question: "A solution contains 2.00 mole of acetic acid, CH3COOH, and 1.00 mole of calcium acetate, Ca(CH3COO)2. The solution is able to resist the addition of a small amount of strong acid or strong base with only minor changes in the pH of the solution. Larger quantities of strong acid or strong base can cause a significant change in pH. How many moles of nitric acid, HNO3, may be added before the pH begins to change significantly?
       (A) 0.250 mole (B) 0.500 mole (C) 3.00 mole (D) 1.00 mole (E) 3.50 mole (F) 1.50 mole (G) 2.50 mole (H) 4.00 mole (I) 0.750 mole (J) 2.00 mole"
     target: "Let's think step by step. We would like to compute the buffer capacity of this solution. First we write the equation for the ionization of the weak acid, in this case of acetic acid. $CH_{3}COOH (aq) + H_{2}O \\rightarrow H_{3}O^{+} + CH3COO^{-}$. The conjugate base is therefore the acetate ion. The added strong acid, Nitric acid, will react with the conjugate base. Therefore the maximum amount of acid that can be added will be equal to the amount of acetate ion, or 2 moles. The answer is (J)."
-group: mmlu_pro_flan_cot_fewshot_stem
+tag: mmlu_pro_flan_cot_fewshot_stem
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_chemistry
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_computer_science.yaml
@@ -58,6 +58,6 @@ fewshot_config:
       What is displayed as a result of running the program?
       (A) November (B) Foxtrot (C) Zulu (D) Alpha (E) Charlie (F) Bravo (G) Yankee (H) Echo (I) Hotel (J) Delta"
     target: "Let's think step by step. Because X has the value 5, the first conditional IF (X < 0) is false, so we move to the first ELSE clause. Because X is 5 and Y is 10, the second conditional IF (X > Y) is false, so we move to the following ELSE clause. Since Y is 10, the conditional IF (Y > 0) is true, so the command DISPLAY (\"November\") is executed. The answer is (A)."
-group: mmlu_pro_flan_cot_fewshot_stem
+tag: mmlu_pro_flan_cot_fewshot_stem
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_computer_science
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_economics.yaml
@@ -31,6 +31,6 @@ fewshot_config:
   - question: "The concentration ratio for a monopoly is
       (A) 50 (B) 5 (C) 10 (D) 90 (E) 15 (F) 100 (G) 0 (H) 25 (I) 75 (J) N/A"
     target: "Let's think step by step. We refer to Wikipedia articles on microeconomics for help. The concentration ratio is calculated as the sum of market share of a specific number of largest companies. Monopoly means one company or entity controls the entire market, therefore, the concentration ratio is 100 percent. The answer is (F)."
-group: mmlu_pro_flan_cot_fewshot_social_sciences
+tag: mmlu_pro_flan_cot_fewshot_social_sciences
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_economics
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_engineering.yaml
@@ -18,6 +18,6 @@ fewshot_config:
   - question: "In a 2 pole lap winding dc machine, the resistance of one conductor is 2Ω and the total number of conductors is 100. Find the total resistance
       (A) 50Ω (B) 1Ω (C) 25Ω (D) 200Ω (E) 10Ω (F) 100Ω (G) 500Ω (H) 150Ω (I) 75Ω (J) 20Ω"
     target: "Let's think step by step. In lap winding, effectively two resistors are connected in parallel, so the actual resistance of each pair is 1 Ohm. Since we have 50 pairs, we get a total resistance of 50 Ohms. The answer is (A)."
-group: mmlu_pro_flan_cot_fewshot_stem
+tag: mmlu_pro_flan_cot_fewshot_stem
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_engineering
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_health.yaml
@@ -13,6 +13,6 @@ fewshot_config:
     target: "Let's think step by step. We refer to Wikipedia articles on clinical knowledge for help. According to the medical protocol as of 2020, you should make two attempts to cannulate a patient before passing the job on to a more-senior practitioner. The answer is (F)."
   - question: "Why are parvoviruses a highly impactful parasite? (A) They are able to alter the host's DNA (B) Because they have no nucleic acid (C) They can survive in extreme temperatures (D) Only replicate in dividing cells (E) They can infect multiple species (F) They don't require a host to survive (G) Can integrate into host chromosomes (H) N/A (I) N/A (J) N/A"
     target: "Let's think step by step. We refer to Wikipedia articles on virology for help. Paroviruses are highly impactful because they do not have nucleic acid. The answer is (B)."
-group: mmlu_pro_flan_cot_fewshot_other
+tag: mmlu_pro_flan_cot_fewshot_other
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_health
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_history.yaml
@@ -40,6 +40,6 @@ fewshot_config:
   - question: "Recent research on hominid species dating from the Middle Pliocene indicates there was (as of 2020):
       (A) multiple hominid species but with limited diversity. (B) a single species with no diversity. (C) decreased species diversity but increased numbers of hammerstones and flakes, indicating stone tool manufacture. (D) a single dominant species that outcompeted all others, leading to decreased diversity. (E) increased species diversity due to a prolonged ice age followed by a severe drought. (F) decreased species diversity due to a prolonged ice age followed by a severe drought. (G) a great amount of species diversity, or a single species that exhibited a lot of diversity. (H) increased species diversity but with decreased population numbers due to harsh climate conditions. (I) increased species diversity but decreased numbers of hammerstones and flakes, indicating less stone tool manufacture. (J) very little species diversity during this period and very few hominids."
     target: "Let's think step by step. We refer to Wikipedia articles on prehistory for help. Recent research has recognized multiple hominid species from the Middle Pliocene, meaning that there is a great amount of species diversity or diversity in a single species. The answer is (G)."
-group: mmlu_pro_flan_cot_fewshot_humanities
+tag: mmlu_pro_flan_cot_fewshot_humanities
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_history
lm_eval/tasks/mmlu_pro/flan_cot_fewshot/mmlu_pro_law.yaml
@@ -18,6 +18,6 @@ fewshot_config:
   - question: "A state has recently enacted a statute prohibiting the disposal of any nuclear wastes within the state. This law does not contravene or conflict with any federal statutes. A man operates a company in the state that is engaged in the disposal of nuclear wastes. Subsequent to the passage of the state statute, the man, not yet aware of the new law, entered into contracts with many out-of-state firms to dispose of their nuclear wastes in the state. On account of this new law, however, the man will be unable to perform these contracts. Assume that the man has standing to challenge this state law. Which of the following presents his strongest constitutional grounds to challenge the state law prohibiting the disposal of nuclear wastes within the state?
       (A) The second amendment - the right to bear arms. (B) The due process clause of the Fourteenth Amendment. (C) The tenth amendment - powers not delegated to the United States by the Constitution. (D) The first amendment - freedom of speech. (E) The privileges and immunities clause of Article IV, Section 2. (F) The commerce clause. (G) The sixth amendment - right to a fair trial. (H) The eighth amendment - prohibition of cruel and unusual punishment. (I) The equal protection clause of the Fourteenth Amendment. (J) N/A"
     target: "Let's think step by step. We refer to Wikipedia articles on law for help. The commerce clause states that Congress shall have the power to regulate commerce with foreign Nations, and among the several States, and with the Indian Tribes. The statute affects inter-state commerce which puts it into question. Hence the man's strongest argument should be the commerce clause. The answer is (F)."
-group: mmlu_pro_flan_cot_fewshot_humanities
+tag: mmlu_pro_flan_cot_fewshot_humanities
 include: _mmlu_pro_flan_cot_fewshot_template_yaml
 task: mmlu_pro_flan_cot_fewshot_law