gaoqiong / lm-evaluation-harness
Commit a2af2101 (Unverified)
Authored Jul 12, 2024 by Yen-Ting Lin; committed by GitHub, Jul 12, 2024
Merge branch 'EleutherAI:main' into main
Parents: 82cb25c1, d5f39bf8
Showing 20 changed files with 93 additions and 50 deletions (+93 -50)
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_world_religions.yaml (+42 -27)
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu.yaml (+30 -4)
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu_flan_cot_zeroshot_template_yaml (+4 -2)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_abstract_algebra.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_anatomy.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_astronomy.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_business_ethics.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_clinical_knowledge.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_biology.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_chemistry.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_computer_science.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_mathematics.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_medicine.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_physics.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_computer_security.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_conceptual_physics.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_econometrics.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_electrical_engineering.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_elementary_mathematics.yaml (+1 -1)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_formal_logic.yaml (+1 -1)
Too many changes to show. To preserve performance, only 1000 of 1000+ files are displayed.
lm_eval/tasks/mmlu/flan_cot_fewshot/mmlu_world_religions.yaml
-"dataset_name": "world_religions"
-"description": "The following are multiple choice questions (with answers) about world\
-  \ religions.\n\nQ: How can the Upanishads be characterized?\n(A) Ritual texts (B)\
-  \ Philosophical texts (C) Hymns (D) Origin stories\nA: Let's think step by step.\
-  \ We refer to Wikipedia articles on world religions for help. The Upanishads are\
-  \ the most recent part of Vedas (the oldest scriptures in Hinduism) and supplied\
-  \ the basis of later Hindu philosophy. So they are philosophical texts. The answer\
-  \ is (B).\n\nQ: What is the Second Gem in Buddhism?\n(A) The Dharma (B) The Sangha\
-  \ (C) The Buddha (D) The Bodhisattva\nA: Let's think step by step. We refer to Wikipedia\
-  \ articles on world religions for help. The Second Gem in Buddhism is The Dharma.\
-  \ The answer is (A).\n\nQ: Which Japanese government promoted a kind of national\
-  \ cult based on the emperor and his associations with kami?\n(A) Honen (B) Tanaka\
-  \ (C) Tokugawa (D) Meiji\nA: Let's think step by step. We refer to Wikipedia articles\
-  \ on world religions for help. The promotion of a national cult based on the emperor\
-  \ and his associations with Kami happened during the reign of Emperor Meiji (1852-1912).\
-  \ The answer is (D).\n\nQ: In which dynasty was the \"Mandate of Heaven\" developed\
-  \ to legitimatize the new rulers?\n(A) Shang (B) Zhou (C) Han (D) Xia\nA: Let's\
-  \ think step by step. We refer to Wikipedia articles on world religions for help.\
-  \ The \"Mandate of Heaven\" was developed as an ancient Chinese philosophical concept\
-  \ during the Zhou Dynasty (1046-256 BCE). The answer is (B).\n\nQ: What is the sign\
-  \ of the covenant for Jewish males?\n(A) The rainbow (B) Circumcision (C) A son\
-  \ (D) Bar mitzvah\nA: Let's think step by step. We refer to Wikipedia articles on\
-  \ world religions for help. In Judaism, the most distinctive sign of the covenant\
-  \ is circumcision (brit milah). The answer is (B).\n\n"
-"group": "mmlu_flan_cot_fewshot_humanities"
-"include": "_mmlu_flan_cot_fewshot_template_yaml"
-"task": "mmlu_flan_cot_fewshot_world_religions"
+dataset_name: world_religions
+description: The following are multiple choice questions (with answers) about world
+  religions.
+fewshot_config:
+  sampler: first_n
+  samples:
+  - question: 'How can the Upanishads be characterized?
+
+      (A) Ritual texts (B) Philosophical texts (C) Hymns (D) Origin stories'
+    target: Let's think step by step. We refer to Wikipedia articles on world religions
+      for help. The Upanishads are the most recent part of Vedas (the oldest scriptures
+      in Hinduism) and supplied the basis of later Hindu philosophy. So they are philosophical
+      texts. The answer is (B).
+  - question: 'What is the Second Gem in Buddhism?
+
+      (A) The Dharma (B) The Sangha (C) The Buddha (D) The Bodhisattva'
+    target: Let's think step by step. We refer to Wikipedia articles on world religions
+      for help. The Second Gem in Buddhism is The Dharma. The answer is (A).
+  - question: 'Which Japanese government promoted a kind of national cult based on the
+      emperor and his associations with kami?
+
+      (A) Honen (B) Tanaka (C) Tokugawa (D) Meiji'
+    target: Let's think step by step. We refer to Wikipedia articles on world religions
+      for help. The promotion of a national cult based on the emperor and his associations
+      with Kami happened during the reign of Emperor Meiji (1852-1912). The answer
+      is (D).
+  - question: 'In which dynasty was the "Mandate of Heaven" developed to legitimatize
+      the new rulers?
+
+      (A) Shang (B) Zhou (C) Han (D) Xia'
+    target: Let's think step by step. We refer to Wikipedia articles on world religions
+      for help. The "Mandate of Heaven" was developed as an ancient Chinese philosophical
+      concept during the Zhou Dynasty (1046-256 BCE). The answer is (B).
+  - question: 'What is the sign of the covenant for Jewish males?
+
+      (A) The rainbow (B) Circumcision (C) A son (D) Bar mitzvah'
+    target: 'Let''s think step by step. We refer to Wikipedia articles on world religions
+      for help. In Judaism, the most distinctive sign of the covenant is circumcision
+      (brit milah). The answer is (B).'
+tag: mmlu_flan_cot_fewshot_humanities
+include: _mmlu_flan_cot_fewshot_template_yaml
+task: mmlu_flan_cot_fewshot_world_religions
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu.yaml
 group: mmlu_flan_cot_zeroshot
+group_alias: mmlu (flan style, zeroshot cot)
 task:
-  - mmlu_flan_cot_zeroshot_stem
-  - mmlu_flan_cot_zeroshot_other
-  - mmlu_flan_cot_zeroshot_social_sciences
-  - mmlu_flan_cot_zeroshot_humanities
+  - group: stem
+    task:
+      - mmlu_flan_cot_zeroshot_stem
+    aggregate_metric_list:
+      - metric: acc
+        weight_by_size: True
+  - group: other
+    task:
+      - mmlu_flan_cot_zeroshot_other
+    aggregate_metric_list:
+      - metric: acc
+        weight_by_size: True
+  - group: social sciences
+    task:
+      - mmlu_flan_cot_zeroshot_social_sciences
+    aggregate_metric_list:
+      - metric: acc
+        weight_by_size: True
+  - group: humanities
+    task:
+      - mmlu_flan_cot_zeroshot_humanities
+    aggregate_metric_list:
+      - metric: acc
+        weight_by_size: True
+aggregate_metric_list:
+  - metric: acc
+    weight_by_size: True
+metadata:
+  version: 1
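As a rough illustration of what `weight_by_size: True` is generally understood to mean for a group's `aggregate_metric_list`, the sketch below computes a size-weighted mean of per-subtask accuracies, next to the unweighted mean for comparison. The function name `aggregate_acc`, the scores, and the subtask sizes are hypothetical; this is not the harness's own aggregation code.

```python
def aggregate_acc(scores, sizes, weight_by_size=True):
    """Combine per-subtask accuracies into a single group score.

    With weight_by_size=True each subtask counts in proportion to its
    number of examples; otherwise every subtask counts equally.
    """
    if not weight_by_size:
        sizes = [1] * len(scores)
    total = sum(sizes)
    return sum(score * n for score, n in zip(scores, sizes)) / total


# Hypothetical subtasks: 300 examples at 0.5 accuracy, 100 examples at 1.0.
weighted = aggregate_acc([0.5, 1.0], [300, 100])
unweighted = aggregate_acc([0.5, 1.0], [300, 100], weight_by_size=False)
print(weighted, unweighted)
```

With these made-up numbers the larger subtask pulls the weighted score toward its own accuracy, which is the practical difference between the two settings.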
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu_flan_cot_zeroshot_template_yaml
@@ -8,7 +8,7 @@ filter_list:
   - name: "strict-match"
     filter:
       - function: "regex"
-        regex_pattern: "((?<=The answer is )(.*)(?=.)|(?<=the answer is )(.*)(?=.)|(?<=The answer: )(.*)(?=.)|(?<=The final answer: )(.*)(?=.))"
+        regex_pattern: "((?<=The answer is )(.*)(?=.)|(?<=answer is )(.*)(?=.)|(?<=The answer: )(.*)(?=.)|(?<=The final answer: )(.*)(?=.))"
       - function: "take_first"
   - name: "flexible-extract"
     filter:
@@ -33,4 +33,6 @@ metric_list:
     ignore_case: true
     ignore_punctuation: true
 metadata:
-  version: 1.0
+  version: 2.0
+dataset_kwargs:
+  trust_remote_code: true
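To see what the `strict-match` pattern change does in practice, the sketch below runs the old and new `regex_pattern` values through Python's `re.search`. The helper `extract` and the sample strings are illustrative assumptions; the harness's own regex filter may select matches differently.

```python
import re

# Patterns copied from the diff above (old vs. new strict-match regex).
OLD = r"((?<=The answer is )(.*)(?=.)|(?<=the answer is )(.*)(?=.)|(?<=The answer: )(.*)(?=.)|(?<=The final answer: )(.*)(?=.))"
NEW = r"((?<=The answer is )(.*)(?=.)|(?<=answer is )(.*)(?=.)|(?<=The answer: )(.*)(?=.)|(?<=The final answer: )(.*)(?=.))"


def extract(pattern, text):
    """Return the first captured answer span, or None (hypothetical helper)."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


canonical = "The answer is (B)."
variant = "So the final answer is (B)."

print(extract(OLD, canonical), extract(NEW, canonical))
print(extract(OLD, variant), extract(NEW, variant))
```

Both patterns handle the canonical phrasing, but only the new one picks up phrasings such as "the final answer is", because its lookbehind requires just "answer is ". Note that the trailing `(?=.)` always leaves the last character of the line out of the match, so a response without a closing period loses its final character.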
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_abstract_algebra.yaml
 "dataset_name": "abstract_algebra"
 "description": "The following are multiple choice questions (with answers) about abstract\
   \ algebra.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_abstract_algebra"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_anatomy.yaml
 "dataset_name": "anatomy"
 "description": "The following are multiple choice questions (with answers) about anatomy.\n\
   \n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_anatomy"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_astronomy.yaml
 "dataset_name": "astronomy"
 "description": "The following are multiple choice questions (with answers) about astronomy.\n\
   \n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_astronomy"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_business_ethics.yaml
 "dataset_name": "business_ethics"
 "description": "The following are multiple choice questions (with answers) about business\
   \ ethics.\n\n"
-"group": "mmlu_flan_cot_zeroshot_other"
+"tag": "mmlu_flan_cot_zeroshot_other"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_business_ethics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_clinical_knowledge.yaml
 "dataset_name": "clinical_knowledge"
 "description": "The following are multiple choice questions (with answers) about clinical\
   \ knowledge.\n\n"
-"group": "mmlu_flan_cot_zeroshot_other"
+"tag": "mmlu_flan_cot_zeroshot_other"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_clinical_knowledge"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_biology.yaml
 "dataset_name": "college_biology"
 "description": "The following are multiple choice questions (with answers) about college\
   \ biology.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_college_biology"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_chemistry.yaml
 "dataset_name": "college_chemistry"
 "description": "The following are multiple choice questions (with answers) about college\
   \ chemistry.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_college_chemistry"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_computer_science.yaml
 "dataset_name": "college_computer_science"
 "description": "The following are multiple choice questions (with answers) about college\
   \ computer science.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_college_computer_science"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_mathematics.yaml
 "dataset_name": "college_mathematics"
 "description": "The following are multiple choice questions (with answers) about college\
   \ mathematics.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_college_mathematics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_medicine.yaml
 "dataset_name": "college_medicine"
 "description": "The following are multiple choice questions (with answers) about college\
   \ medicine.\n\n"
-"group": "mmlu_flan_cot_zeroshot_other"
+"tag": "mmlu_flan_cot_zeroshot_other"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_college_medicine"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_college_physics.yaml
 "dataset_name": "college_physics"
 "description": "The following are multiple choice questions (with answers) about college\
   \ physics.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_college_physics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_computer_security.yaml
 "dataset_name": "computer_security"
 "description": "The following are multiple choice questions (with answers) about computer\
   \ security.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_computer_security"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_conceptual_physics.yaml
 "dataset_name": "conceptual_physics"
 "description": "The following are multiple choice questions (with answers) about conceptual\
   \ physics.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_conceptual_physics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_econometrics.yaml
 "dataset_name": "econometrics"
 "description": "The following are multiple choice questions (with answers) about econometrics.\n\
   \n"
-"group": "mmlu_flan_cot_zeroshot_social_sciences"
+"tag": "mmlu_flan_cot_zeroshot_social_sciences"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_econometrics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_electrical_engineering.yaml
 "dataset_name": "electrical_engineering"
 "description": "The following are multiple choice questions (with answers) about electrical\
   \ engineering.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_electrical_engineering"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_elementary_mathematics.yaml
 "dataset_name": "elementary_mathematics"
 "description": "The following are multiple choice questions (with answers) about elementary\
   \ mathematics.\n\n"
-"group": "mmlu_flan_cot_zeroshot_stem"
+"tag": "mmlu_flan_cot_zeroshot_stem"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_elementary_mathematics"
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_formal_logic.yaml
 "dataset_name": "formal_logic"
 "description": "The following are multiple choice questions (with answers) about formal\
   \ logic.\n\n"
-"group": "mmlu_flan_cot_zeroshot_humanities"
+"tag": "mmlu_flan_cot_zeroshot_humanities"
 "include": "_mmlu_flan_cot_zeroshot_template_yaml"
 "task": "mmlu_flan_cot_zeroshot_formal_logic"