gaoqiong / lm-evaluation-harness
Commit 815f59e6 (Unverified)
Authored Nov 06, 2023 by Lintang Sutawika
Committed by GitHub, Nov 06, 2023

Merge pull request #922 from EleutherAI/mmlu_subgroups

[Refactor] Mmlu subgroups and weight avg

Parents: 3533e4b9, 44124d95
Changes: 299
Showing 20 changed files with 160 additions and 233 deletions (+160 -233)
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_virology.yaml  +31 -55
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_world_religions.yaml  +27 -53
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu.yaml  +6 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu_flan_cot_zeroshot_template_yaml  +0 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_abstract_algebra.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_anatomy.yaml  +6 -7
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_astronomy.yaml  +6 -7
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_business_ethics.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_clinical_knowledge.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_biology.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_chemistry.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_computer_science.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_mathematics.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_medicine.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_physics.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_computer_security.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_conceptual_physics.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_econometrics.yaml  +6 -7
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_electrical_engineering.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_elementary_mathematics.yaml  +6 -8
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_virology.yaml
-dataset_name: virology
-description: 'The following are multiple choice questions (with answers) about virology.
-
-
-  Q: The median survival time to AIDS and death was established by following:
-
-  (A) Seroprevalent HIV-infected individuals (B) Seronegatives (C) Seroconverters
-  (D) High-risk seronegatives
-
-  A: Let''s think step by step. We refer to Wikipedia articles on virology for help.
-  The median survival time to AIDS and death was established as a result of the development
-  of seroconverters. The answer is (C).
-
-
-  Q: Which of the following is a morphological characteristic of the paramyxoviruses.
-
-  (A) Fragile viruses often visualised with RNA spewing from the inside (B) Elongate
-  viruses (C) Icosahedral viruses with envelope (D) Very large viruses
-
-  A: Let''s think step by step. We refer to Wikipedia articles on virology for help.
-  Paramyxoviruses are fragile viruses often visualised with RNA spewing from the inside.
-  The answer is (A).
-
-
-  Q: The most important goal of a behavioral intervention is:
-
-  (A) Change in behavior (B) Comprehensive coverage (C) Effective use of behavioral
-  theory (D) Sustained behavior change
-
-  A: Let''s think step by step. We refer to Wikipedia articles on virology for help.
-  The prim goal of a behavioral intervention is to cause sustained behavior change.
-  The answer is (D).
-
-
-  Q: A key factor facilitating the application of nested case-control studies from
-  the MACS was:
-
-  (A) Data collection (B) Establishment of a repository of biologic specimens (C)
-  Participant interest (D) Administration of the questionnaire by staff
-
-  A: Let''s think step by step. We refer to Wikipedia articles on virology for help.
-  The Multicenter AIDS Cohort Study''s use of nested case-control studies was facilitated
-  by the establishment of a repository of biologic specimens. The answer is (B).
-
-
-  Q: Why are parvoviruses a highly impactful parasite?
-
-  (A) Because they have no nucleic acid (B) They require a helper virus (C) Only replicate
-  in dividing cells (D) Can integrate into host chromosomes
-
-  A: Let''s think step by step. We refer to Wikipedia articles on virology for help.
-  Paroviruses are highly impactful because they do not have nucleic acid. The answer
-  is (A).'
-include: _mmlu_flan_cot_fewshot_template_yaml
-task: mmlu_flan_cot_fewshot_virology
+"dataset_name": "virology"
+"description": "The following are multiple choice questions (with answers) about virology.\n\
+  \nQ: The median survival time to AIDS and death was established by following:\n\
+  (A) Seroprevalent HIV-infected individuals (B) Seronegatives (C) Seroconverters\
+  \ (D) High-risk seronegatives\nA: Let's think step by step. We refer to Wikipedia\
+  \ articles on virology for help. The median survival time to AIDS and death was\
+  \ established as a result of the development of seroconverters. The answer is (C).\n\
+  \nQ: Which of the following is a morphological characteristic of the paramyxoviruses.\n\
+  (A) Fragile viruses often visualised with RNA spewing from the inside (B) Elongate\
+  \ viruses (C) Icosahedral viruses with envelope (D) Very large viruses\nA: Let's\
+  \ think step by step. We refer to Wikipedia articles on virology for help. Paramyxoviruses\
+  \ are fragile viruses often visualised with RNA spewing from the inside. The answer\
+  \ is (A).\n\nQ: The most important goal of a behavioral intervention is:\n(A) Change\
+  \ in behavior (B) Comprehensive coverage (C) Effective use of behavioral theory\
+  \ (D) Sustained behavior change\nA: Let's think step by step. We refer to Wikipedia\
+  \ articles on virology for help. The prim goal of a behavioral intervention is to\
+  \ cause sustained behavior change. The answer is (D).\n\nQ: A key factor facilitating\
+  \ the application of nested case-control studies from the MACS was:\n(A) Data collection\
+  \ (B) Establishment of a repository of biologic specimens (C) Participant interest\
+  \ (D) Administration of the questionnaire by staff\nA: Let's think step by step.\
+  \ We refer to Wikipedia articles on virology for help. The Multicenter AIDS Cohort\
+  \ Study's use of nested case-control studies was facilitated by the establishment\
+  \ of a repository of biologic specimens. The answer is (B).\n\nQ: Why are parvoviruses\
+  \ a highly impactful parasite?\n(A) Because they have no nucleic acid (B) They require\
+  \ a helper virus (C) Only replicate in dividing cells (D) Can integrate into host\
+  \ chromosomes\nA: Let's think step by step. We refer to Wikipedia articles on virology\
+  \ for help. Paroviruses are highly impactful because they do not have nucleic acid.\
+  \ The answer is (A)."
+"group": "mmlu_flan_cot_fewshot_other"
+"include": "_mmlu_flan_cot_fewshot_template_yaml"
+"task": "mmlu_flan_cot_fewshot_virology"
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_world_religions.yaml
-dataset_name: world_religions
-description: 'The following are multiple choice questions (with answers) about world
-  religions.
-
-
-  Q: How can the Upanishads be characterized?
-
-  (A) Ritual texts (B) Philosophical texts (C) Hymns (D) Origin stories
-
-  A: Let''s think step by step. We refer to Wikipedia articles on world religions
-  for help. The Upanishads are the most recent part of Vedas (the oldest scriptures
-  in Hinduism) and supplied the basis of later Hindu philosophy. So they are philosophical
-  texts. The answer is (B).
-
-
-  Q: What is the Second Gem in Buddhism?
-
-  (A) The Dharma (B) The Sangha (C) The Buddha (D) The Bodhisattva
-
-  A: Let''s think step by step. We refer to Wikipedia articles on world religions
-  for help. The Second Gem in Buddhism is The Dharma. The answer is (A).
-
-
-  Q: Which Japanese government promoted a kind of national cult based on the emperor
-  and his associations with kami?
-
-  (A) Honen (B) Tanaka (C) Tokugawa (D) Meiji
-
-  A: Let''s think step by step. We refer to Wikipedia articles on world religions
-  for help. The promotion of a national cult based on the emperor and his associations
-  with Kami happened during the reign of Emperor Meiji (1852-1912). The answer is
-  (D).
-
-
-  Q: In which dynasty was the "Mandate of Heaven" developed to legitimatize the new
-  rulers?
-
-  (A) Shang (B) Zhou (C) Han (D) Xia
-
-  A: Let''s think step by step. We refer to Wikipedia articles on world religions
-  for help. The "Mandate of Heaven" was developed as an ancient Chinese philosophical
-  concept during the Zhou Dynasty (1046-256 BCE). The answer is (B).
-
-
-  Q: What is the sign of the covenant for Jewish males?
-
-  (A) The rainbow (B) Circumcision (C) A son (D) Bar mitzvah
-
-  A: Let''s think step by step. We refer to Wikipedia articles on world religions
-  for help. In Judaism, the most distinctive sign of the covenant is circumcision
-  (brit milah). The answer is (B).'
-include: _mmlu_flan_cot_fewshot_template_yaml
-task: mmlu_flan_cot_fewshot_world_religions
+"dataset_name": "world_religions"
+"description": "The following are multiple choice questions (with answers) about world\
+  \ religions.\n\nQ: How can the Upanishads be characterized?\n(A) Ritual texts (B)\
+  \ Philosophical texts (C) Hymns (D) Origin stories\nA: Let's think step by step.\
+  \ We refer to Wikipedia articles on world religions for help. The Upanishads are\
+  \ the most recent part of Vedas (the oldest scriptures in Hinduism) and supplied\
+  \ the basis of later Hindu philosophy. So they are philosophical texts. The answer\
+  \ is (B).\n\nQ: What is the Second Gem in Buddhism?\n(A) The Dharma (B) The Sangha\
+  \ (C) The Buddha (D) The Bodhisattva\nA: Let's think step by step. We refer to Wikipedia\
+  \ articles on world religions for help. The Second Gem in Buddhism is The Dharma.\
+  \ The answer is (A).\n\nQ: Which Japanese government promoted a kind of national\
+  \ cult based on the emperor and his associations with kami?\n(A) Honen (B) Tanaka\
+  \ (C) Tokugawa (D) Meiji\nA: Let's think step by step. We refer to Wikipedia articles\
+  \ on world religions for help. The promotion of a national cult based on the emperor\
+  \ and his associations with Kami happened during the reign of Emperor Meiji (1852-1912).\
+  \ The answer is (D).\n\nQ: In which dynasty was the \"Mandate of Heaven\" developed\
+  \ to legitimatize the new rulers?\n(A) Shang (B) Zhou (C) Han (D) Xia\nA: Let's\
+  \ think step by step. We refer to Wikipedia articles on world religions for help.\
+  \ The \"Mandate of Heaven\" was developed as an ancient Chinese philosophical concept\
+  \ during the Zhou Dynasty (1046-256 BCE). The answer is (B).\n\nQ: What is the sign\
+  \ of the covenant for Jewish males?\n(A) The rainbow (B) Circumcision (C) A son\
+  \ (D) Bar mitzvah\nA: Let's think step by step. We refer to Wikipedia articles on\
+  \ world religions for help. In Judaism, the most distinctive sign of the covenant\
+  \ is circumcision (brit milah). The answer is (B)."
+"group": "mmlu_flan_cot_fewshot_humanities"
+"include": "_mmlu_flan_cot_fewshot_template_yaml"
+"task": "mmlu_flan_cot_fewshot_world_religions"
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu.yaml
0 → 100644
+group: mmlu_flan_cot_zeroshot
+task:
+  - mmlu_flan_cot_zeroshot_stem
+  - mmlu_flan_cot_zeroshot_other
+  - mmlu_flan_cot_zeroshot_social_sciences
+  - mmlu_flan_cot_zeroshot_humanities
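The commit title mentions a weighted average over the new subgroups, but the aggregation code is not among the files shown on this page. As a rough illustration only, the sketch below assumes each subtask reports an accuracy and a sample count; the function name `weighted_average`, the result layout, and the numbers are all hypothetical and are not taken from lm-evaluation-harness.

```python
from typing import Dict, Tuple


def weighted_average(results: Dict[str, Tuple[float, int]]) -> float:
    """Average per-task accuracy, weighting each task by its sample count.

    `results` maps a task name to (accuracy, number_of_samples).
    This layout is an assumption made for the sketch.
    """
    total = sum(count for _, count in results.values())
    if total == 0:
        raise ValueError("cannot average over zero samples")
    return sum(acc * count for acc, count in results.values()) / total


# Hypothetical scores for two subtasks of one subgroup (made-up numbers).
subtask_results = {
    "mmlu_flan_cot_zeroshot_anatomy": (0.50, 100),
    "mmlu_flan_cot_zeroshot_astronomy": (0.80, 300),
}

# Size-weighted mean: larger subtasks contribute proportionally more.
weighted = weighted_average(subtask_results)

# Unweighted mean of the per-task accuracies, for comparison.
unweighted = sum(acc for acc, _ in subtask_results.values()) / len(subtask_results)

print(f"weighted={weighted:.3f} unweighted={unweighted:.3f}")
```

The two figures diverge whenever subtasks differ in size, which is presumably why a group-level score needs an explicit choice between them.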
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu_flan_generative_template_yaml → lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu_flan_cot_zeroshot_template_yaml
File moved
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_abstract_algebra.yaml
-dataset_name: abstract_algebra
-description: 'The following are multiple choice questions (with answers) about abstract
-  algebra.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_abstract_algebra
+"dataset_name": "abstract_algebra"
+"description": "The following are multiple choice questions (with answers) about abstract\
+  \ algebra.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_abstract_algebra"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_anatomy.yaml
-dataset_name: anatomy
-description: 'The following are multiple choice questions (with answers) about anatomy.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_anatomy
+"dataset_name": "anatomy"
+"description": "The following are multiple choice questions (with answers) about anatomy.\n\
+  \n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_anatomy"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_astronomy.yaml
-dataset_name: astronomy
-description: 'The following are multiple choice questions (with answers) about astronomy.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_astronomy
+"dataset_name": "astronomy"
+"description": "The following are multiple choice questions (with answers) about astronomy.\n\
+  \n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_astronomy"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_business_ethics.yaml
-dataset_name: business_ethics
-description: 'The following are multiple choice questions (with answers) about business
-  ethics.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_business_ethics
+"dataset_name": "business_ethics"
+"description": "The following are multiple choice questions (with answers) about business\
+  \ ethics.\n\n"
+"group": "mmlu_flan_cot_zeroshot_other"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_business_ethics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_clinical_knowledge.yaml
-dataset_name: clinical_knowledge
-description: 'The following are multiple choice questions (with answers) about clinical
-  knowledge.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_clinical_knowledge
+"dataset_name": "clinical_knowledge"
+"description": "The following are multiple choice questions (with answers) about clinical\
+  \ knowledge.\n\n"
+"group": "mmlu_flan_cot_zeroshot_other"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_clinical_knowledge"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_biology.yaml
-dataset_name: college_biology
-description: 'The following are multiple choice questions (with answers) about college
-  biology.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_college_biology
+"dataset_name": "college_biology"
+"description": "The following are multiple choice questions (with answers) about college\
+  \ biology.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_college_biology"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_chemistry.yaml
-dataset_name: college_chemistry
-description: 'The following are multiple choice questions (with answers) about college
-  chemistry.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_college_chemistry
+"dataset_name": "college_chemistry"
+"description": "The following are multiple choice questions (with answers) about college\
+  \ chemistry.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_college_chemistry"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_computer_science.yaml
-dataset_name: college_computer_science
-description: 'The following are multiple choice questions (with answers) about college
-  computer science.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_college_computer_science
+"dataset_name": "college_computer_science"
+"description": "The following are multiple choice questions (with answers) about college\
+  \ computer science.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_college_computer_science"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_mathematics.yaml
-dataset_name: college_mathematics
-description: 'The following are multiple choice questions (with answers) about college
-  mathematics.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_college_mathematics
+"dataset_name": "college_mathematics"
+"description": "The following are multiple choice questions (with answers) about college\
+  \ mathematics.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_college_mathematics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_medicine.yaml
-dataset_name: college_medicine
-description: 'The following are multiple choice questions (with answers) about college
-  medicine.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_college_medicine
+"dataset_name": "college_medicine"
+"description": "The following are multiple choice questions (with answers) about college\
+  \ medicine.\n\n"
+"group": "mmlu_flan_cot_zeroshot_other"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_college_medicine"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_physics.yaml
-dataset_name: college_physics
-description: 'The following are multiple choice questions (with answers) about college
-  physics.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_college_physics
+"dataset_name": "college_physics"
+"description": "The following are multiple choice questions (with answers) about college\
+  \ physics.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_college_physics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_computer_security.yaml
-dataset_name: computer_security
-description: 'The following are multiple choice questions (with answers) about computer
-  security.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_computer_security
+"dataset_name": "computer_security"
+"description": "The following are multiple choice questions (with answers) about computer\
+  \ security.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_computer_security"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_conceptual_physics.yaml
-dataset_name: conceptual_physics
-description: 'The following are multiple choice questions (with answers) about conceptual
-  physics.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_conceptual_physics
+"dataset_name": "conceptual_physics"
+"description": "The following are multiple choice questions (with answers) about conceptual\
+  \ physics.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_conceptual_physics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_econometrics.yaml
-dataset_name: econometrics
-description: 'The following are multiple choice questions (with answers) about econometrics.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_econometrics
+"dataset_name": "econometrics"
+"description": "The following are multiple choice questions (with answers) about econometrics.\n\
+  \n"
+"group": "mmlu_flan_cot_zeroshot_social_sciences"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_econometrics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_electrical_engineering.yaml
-dataset_name: electrical_engineering
-description: 'The following are multiple choice questions (with answers) about electrical
-  engineering.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_electrical_engineering
+"dataset_name": "electrical_engineering"
+"description": "The following are multiple choice questions (with answers) about electrical\
+  \ engineering.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_electrical_engineering"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_elementary_mathematics.yaml
-dataset_name: elementary_mathematics
-description: 'The following are multiple choice questions (with answers) about elementary
-  mathematics.
-
-
-  '
-include: _mmlu_flan_generative_template_yaml
-task: mmlu_flan_cot_zeroshot_elementary_mathematics
+"dataset_name": "elementary_mathematics"
+"description": "The following are multiple choice questions (with answers) about elementary\
+  \ mathematics.\n\n"
+"group": "mmlu_flan_cot_zeroshot_stem"
+"include": "_mmlu_flan_cot_zeroshot_template_yaml"
+"task": "mmlu_flan_cot_zeroshot_elementary_mathematics"
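The zero-shot files above all share one shape: five keys whose values differ only by subject name and subgroup. The script that generated them is not shown in this diff, so the following is only a sketch of how such a mapping could be built; `make_subject_config` is a hypothetical helper, and `json.dumps` is used purely to inspect the result, not to reproduce the on-disk YAML formatting.

```python
import json


def make_subject_config(subject: str, category: str) -> dict:
    """Build the five keys seen in the per-subject zero-shot configs.

    `subject` is the dataset name (for example "college_physics") and
    `category` is the subgroup suffix (for example "stem"). Both the
    helper and its signature are assumptions made for this sketch.
    """
    readable = subject.replace("_", " ")
    return {
        "dataset_name": subject,
        "description": (
            "The following are multiple choice questions (with answers) "
            f"about {readable}.\n\n"
        ),
        "group": f"mmlu_flan_cot_zeroshot_{category}",
        "include": "_mmlu_flan_cot_zeroshot_template_yaml",
        "task": f"mmlu_flan_cot_zeroshot_{subject}",
    }


config = make_subject_config("college_physics", "stem")
print(json.dumps(config, indent=2))
```

Only the `group` value changes between subgroups (`stem`, `other`, `social_sciences`, `humanities`), which matches the four entries listed in `_mmlu.yaml`.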