gaoqiong
lm-evaluation-harness
Commit 6769119f (Unverified)
Authored Oct 06, 2023 by Hailey Schoelkopf
Committed by GitHub, Oct 06, 2023
Merge pull request #816 from EleutherAI/flan-benchmark
[Refactor] Flan benchmark
Parents: 4824a832, 7d5e511c
Changes: 448
Showing 20 changed files with 159 additions and 0 deletions (+159 -0)
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_formal_logic.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_global_facts.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_biology.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_chemistry.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_computer_science.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_european_history.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_geography.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_government_and_politics.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_macroeconomics.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_mathematics.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_microeconomics.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_physics.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_psychology.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_statistics.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_us_history.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_world_history.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_human_aging.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_human_sexuality.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_international_law.yaml +8 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_jurisprudence.yaml +7 -0
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_formal_logic.yaml
0 → 100644
dataset_name: formal_logic
description: 'The following are multiple choice questions (with answers) about formal logic.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_formal_logic
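
The `include` key suggests that each per-subject file overrides only a few fields of a shared template. The sketch below illustrates that idea under a stated assumption: that `include` acts as a shallow merge in which the including file's keys win. The helper `resolve_include` and the template key `placeholder_shared_key` are hypothetical, introduced only for illustration; they are not the harness's loader or the actual contents of `_mmlu_flan_generative_template_yaml`.

```python
def resolve_include(config, templates):
    """Shallow-merge `config` over the template named by its `include` key.

    Assumption: keys in the including file override template keys.
    """
    merged = dict(templates[config["include"]])  # copy so the template is not mutated
    for key, value in config.items():
        if key != "include":  # the include directive itself is consumed by the merge
            merged[key] = value
    return merged


# Hypothetical template contents, for illustration only.
templates = {
    "_mmlu_flan_generative_template_yaml": {
        "placeholder_shared_key": "shared value",
    }
}

# The four keys from mmlu_formal_logic.yaml in this commit.
formal_logic = {
    "dataset_name": "formal_logic",
    "description": "The following are multiple choice questions (with answers) about formal logic.",
    "include": "_mmlu_flan_generative_template_yaml",
    "task": "mmlu_flan_cot_zeroshot_formal_logic",
}

resolved = resolve_include(formal_logic, templates)
```

Under this assumption, `resolved` carries the shared template key plus the subject-specific `dataset_name`, `description`, and `task`, which is why each per-subject file can stay this small.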
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_global_facts.yaml
0 → 100644
dataset_name: global_facts
description: 'The following are multiple choice questions (with answers) about global facts.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_global_facts
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_biology.yaml
0 → 100644
dataset_name: high_school_biology
description: 'The following are multiple choice questions (with answers) about high school biology.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_biology
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_chemistry.yaml
0 → 100644
dataset_name: high_school_chemistry
description: 'The following are multiple choice questions (with answers) about high school chemistry.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_chemistry
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_computer_science.yaml
0 → 100644
dataset_name: high_school_computer_science
description: 'The following are multiple choice questions (with answers) about high school computer science.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_computer_science
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_european_history.yaml
0 → 100644
dataset_name: high_school_european_history
description: 'The following are multiple choice questions (with answers) about high school european history.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_european_history
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_geography.yaml
0 → 100644
dataset_name: high_school_geography
description: 'The following are multiple choice questions (with answers) about high school geography.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_geography
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_government_and_politics.yaml
0 → 100644
dataset_name: high_school_government_and_politics
description: 'The following are multiple choice questions (with answers) about high school government and politics.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_government_and_politics
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_macroeconomics.yaml
0 → 100644
dataset_name: high_school_macroeconomics
description: 'The following are multiple choice questions (with answers) about high school macroeconomics.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_macroeconomics
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_mathematics.yaml
0 → 100644
dataset_name: high_school_mathematics
description: 'The following are multiple choice questions (with answers) about high school mathematics.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_mathematics
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_microeconomics.yaml
0 → 100644
dataset_name: high_school_microeconomics
description: 'The following are multiple choice questions (with answers) about high school microeconomics.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_microeconomics
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_physics.yaml
0 → 100644
dataset_name: high_school_physics
description: 'The following are multiple choice questions (with answers) about high school physics.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_physics
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_psychology.yaml
0 → 100644
dataset_name: high_school_psychology
description: 'The following are multiple choice questions (with answers) about high school psychology.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_psychology
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_statistics.yaml
0 → 100644
dataset_name: high_school_statistics
description: 'The following are multiple choice questions (with answers) about high school statistics.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_statistics
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_us_history.yaml
0 → 100644
dataset_name: high_school_us_history
description: 'The following are multiple choice questions (with answers) about high school us history.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_us_history
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_high_school_world_history.yaml
0 → 100644
dataset_name: high_school_world_history
description: 'The following are multiple choice questions (with answers) about high school world history.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_high_school_world_history
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_human_aging.yaml
0 → 100644
dataset_name: human_aging
description: 'The following are multiple choice questions (with answers) about human aging.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_human_aging
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_human_sexuality.yaml
0 → 100644
dataset_name: human_sexuality
description: 'The following are multiple choice questions (with answers) about human sexuality.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_human_sexuality
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_international_law.yaml
0 → 100644
dataset_name: international_law
description: 'The following are multiple choice questions (with answers) about international law.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_international_law
lm_eval/tasks/mmlu/flan_cot_zeroshot/mmlu_jurisprudence.yaml
0 → 100644
dataset_name: jurisprudence
description: 'The following are multiple choice questions (with answers) about jurisprudence.'
include: _mmlu_flan_generative_template_yaml
task: mmlu_flan_cot_zeroshot_jurisprudence
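
All twenty files in this part of the diff follow the same four-key pattern, differing only in the subject name. As a rough sketch of that pattern (not the script actually used to produce these files, which this diff does not show), the functions below derive the config, a file name, and a YAML rendering from a subject identifier. `subject_config`, `config_filename`, and `render_yaml` are hypothetical names, and the single-line quoting of `description` is an assumption about layout.

```python
TEMPLATE_NAME = "_mmlu_flan_generative_template_yaml"
TASK_PREFIX = "mmlu_flan_cot_zeroshot_"


def subject_config(subject):
    """Build the four-key config for one MMLU subject, e.g. 'formal_logic'."""
    readable = subject.replace("_", " ")  # 'high_school_biology' -> 'high school biology'
    return {
        "dataset_name": subject,
        "description": (
            "The following are multiple choice questions (with answers) "
            f"about {readable}."
        ),
        "include": TEMPLATE_NAME,
        "task": TASK_PREFIX + subject,
    }


def config_filename(subject):
    """File name pattern seen in the diff: mmlu_<subject>.yaml."""
    return f"mmlu_{subject}.yaml"


def render_yaml(config):
    """Render the flat config as YAML text, single-quoting the description."""
    lines = []
    for key, value in config.items():
        if key == "description":
            # YAML single-quoted scalars escape a quote by doubling it.
            value = "'" + value.replace("'", "''") + "'"
        lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


subjects = ["formal_logic", "global_facts", "jurisprudence"]
rendered = {config_filename(s): render_yaml(subject_config(s)) for s in subjects}
```

Each entry of `rendered` maps a file name such as `mmlu_jurisprudence.yaml` to text matching the key order shown in the files above.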