gaoqiong / lm-evaluation-harness
Commit e1ae8a2f
Authored Nov 26, 2023 by Herbie Bradley
Merge remote-tracking branch 'origin/big-refactor' into calibration
Parents: 50e99bd7, 30936bc7
Showing 20 changed files with 100 additions and 0 deletions
lm_eval/tasks/belebele/belebele_tso_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_tur_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_ukr_Cyrl.yaml  +3 -0
lm_eval/tasks/belebele/belebele_urd_Arab.yaml  +3 -0
lm_eval/tasks/belebele/belebele_urd_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_uzn_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_vie_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_war_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_wol_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_xho_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_yor_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_zho_Hans.yaml  +3 -0
lm_eval/tasks/belebele/belebele_zho_Hant.yaml  +3 -0
lm_eval/tasks/belebele/belebele_zsm_Latn.yaml  +3 -0
lm_eval/tasks/belebele/belebele_zul_Latn.yaml  +3 -0
lm_eval/tasks/benchmarks/flan/flan_anli.yaml  +17 -0
lm_eval/tasks/benchmarks/flan/flan_arc.yaml  +14 -0
lm_eval/tasks/benchmarks/flan/flan_boolq.yaml  +7 -0
lm_eval/tasks/benchmarks/flan/flan_cot.yaml  +11 -0
lm_eval/tasks/benchmarks/flan/flan_held_in.yaml  +6 -0
Too many changes to show. To preserve performance, only 1000 of 1000+ files are displayed.
lm_eval/tasks/belebele/belebele_tso_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "tso_Latn"
"include": "_default_template_yaml"
"task": "belebele_tso_Latn"
lm_eval/tasks/belebele/belebele_tur_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "tur_Latn"
"include": "_default_template_yaml"
"task": "belebele_tur_Latn"
lm_eval/tasks/belebele/belebele_ukr_Cyrl.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "ukr_Cyrl"
"include": "_default_template_yaml"
"task": "belebele_ukr_Cyrl"
lm_eval/tasks/belebele/belebele_urd_Arab.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "urd_Arab"
"include": "_default_template_yaml"
"task": "belebele_urd_Arab"
lm_eval/tasks/belebele/belebele_urd_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "urd_Latn"
"include": "_default_template_yaml"
"task": "belebele_urd_Latn"
lm_eval/tasks/belebele/belebele_uzn_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "uzn_Latn"
"include": "_default_template_yaml"
"task": "belebele_uzn_Latn"
lm_eval/tasks/belebele/belebele_vie_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "vie_Latn"
"include": "_default_template_yaml"
"task": "belebele_vie_Latn"
lm_eval/tasks/belebele/belebele_war_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "war_Latn"
"include": "_default_template_yaml"
"task": "belebele_war_Latn"
lm_eval/tasks/belebele/belebele_wol_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "wol_Latn"
"include": "_default_template_yaml"
"task": "belebele_wol_Latn"
lm_eval/tasks/belebele/belebele_xho_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "xho_Latn"
"include": "_default_template_yaml"
"task": "belebele_xho_Latn"
lm_eval/tasks/belebele/belebele_yor_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "yor_Latn"
"include": "_default_template_yaml"
"task": "belebele_yor_Latn"
lm_eval/tasks/belebele/belebele_zho_Hans.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "zho_Hans"
"include": "_default_template_yaml"
"task": "belebele_zho_Hans"
lm_eval/tasks/belebele/belebele_zho_Hant.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "zho_Hant"
"include": "_default_template_yaml"
"task": "belebele_zho_Hant"
lm_eval/tasks/belebele/belebele_zsm_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "zsm_Latn"
"include": "_default_template_yaml"
"task": "belebele_zsm_Latn"
lm_eval/tasks/belebele/belebele_zul_Latn.yaml (new file, 0 → 100644) @ e1ae8a2f
"dataset_name": "zul_Latn"
"include": "_default_template_yaml"
"task": "belebele_zul_Latn"
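The Belebele files in this commit all follow one three-key pattern that differs only in the language code. The snippet below is a minimal sketch of how files of this shape could be rendered; it is not taken from the commit, and the helper names `render_belebele_config` and `config_filename` are hypothetical.

```python
import json


def render_belebele_config(lang_code: str) -> str:
    """Return the three-line YAML body for one Belebele language subset.

    Hypothetical helper: mirrors the layout of the files shown in this diff.
    """
    fields = {
        "dataset_name": lang_code,
        "include": "_default_template_yaml",
        "task": f"belebele_{lang_code}",
    }
    # json.dumps emits double-quoted strings, which are also valid YAML scalars.
    return "".join(
        f"{json.dumps(key)}: {json.dumps(value)}\n" for key, value in fields.items()
    )


def config_filename(lang_code: str) -> str:
    """Return the file name used for a language code, e.g. belebele_tso_Latn.yaml."""
    return f"belebele_{lang_code}.yaml"


if __name__ == "__main__":
    # Print a few of the configs from this diff instead of writing them to disk.
    for code in ["tso_Latn", "tur_Latn", "zho_Hans"]:
        print(config_filename(code))
        print(render_belebele_config(code))
```

Rendering to a string rather than writing files directly keeps the sketch easy to check against the diff before anything touches the task directory.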
lm_eval/tasks/benchmarks/flan/flan_anli.yaml (new file, 0 → 100644) @ e1ae8a2f
group: flan_anli
task:
  - include: yaml_templates/held_in_template_yaml
    task: anli_r1
    dataset_path: anli
    use_prompt: prompt_templates/anli.yaml:*
    validation_split: dev_r1
  - include: yaml_templates/held_in_template_yaml
    task: anli_r2
    dataset_path: anli
    use_prompt: prompt_templates/anli.yaml:*
    validation_split: dev_r2
  - include: yaml_templates/held_in_template_yaml
    task: anli_r3
    dataset_path: anli
    use_prompt: prompt_templates/anli.yaml:*
    validation_split: dev_r3
lm_eval/tasks/benchmarks/flan/flan_arc.yaml (new file, 0 → 100644) @ e1ae8a2f
group: flan_arc
task:
  - include: yaml_templates/held_in_template_yaml
    task: arc_easy
    dataset_path: ai2_arc
    dataset_name: ARC-Easy
    use_prompt: prompt_templates/arc.yaml:*
    validation_split: validation
  - include: yaml_templates/held_in_template_yaml
    task: arc_challenge
    dataset_path: ai2_arc
    dataset_name: ARC-Challenge
    use_prompt: prompt_templates/arc.yaml:*
    validation_split: validation
lm_eval/tasks/benchmarks/flan/flan_boolq.yaml (new file, 0 → 100644) @ e1ae8a2f
group: flan_boolq
task:
  - include: yaml_templates/held_in_template_yaml
    dataset_path: super_glue
    dataset_name: boolq
    use_prompt: prompt_templates/boolq.yaml:*
    validation_split: validation
lm_eval/tasks/benchmarks/flan/flan_cot.yaml (new file, 0 → 100644) @ e1ae8a2f
group: flan_cot
task:
  - include: yaml_templates/cot_template_yaml
    dataset_path: gsmk
    dataset_name: boolq
    use_prompt: promptsource:*
    validation_split: validation
  - include: yaml_templates/cot_template_yaml
    dataset_path: EleutherAI/asdiv
    use_prompt: promptsource:*
    validation_split: validation
lm_eval/tasks/benchmarks/flan/flan_held_in.yaml (new file, 0 → 100644) @ e1ae8a2f
group: flan_held_in
task:
  - flan_boolq
  - flan_rte
  - flan_anli
  - flan_arc
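The flan_held_in group lists other groups rather than individual tasks. The snippet below is a toy sketch of how such nesting could be flattened into leaf task names. It is not the harness's own resolution logic: the `GROUPS` registry and the `expand` function are hypothetical, and the registry is filled in by hand from the files in this diff. Names with no registry entry are treated as leaves, so `flan_boolq` (whose entry carries no `task` name) and `flan_rte` (whose file is not shown here) are returned unchanged.

```python
from typing import Dict, List, Union

# A group member is either a bare name or an inline task entry with a "task" key.
Member = Union[str, dict]

# Hypothetical in-memory registry, transcribed from the group files in this diff.
GROUPS: Dict[str, List[Member]] = {
    "flan_held_in": ["flan_boolq", "flan_rte", "flan_anli", "flan_arc"],
    "flan_anli": [{"task": "anli_r1"}, {"task": "anli_r2"}, {"task": "anli_r3"}],
    "flan_arc": [{"task": "arc_easy"}, {"task": "arc_challenge"}],
}


def expand(name: str, groups: Dict[str, List[Member]]) -> List[str]:
    """Recursively flatten a group name into its leaf task names.

    Assumes the registry has no cycles; a name with no entry is a leaf.
    """
    if name not in groups:
        return [name]
    leaves: List[str] = []
    for member in groups[name]:
        member_name = member if isinstance(member, str) else member["task"]
        leaves.extend(expand(member_name, groups))
    return leaves


if __name__ == "__main__":
    print(expand("flan_held_in", GROUPS))
```

A depth-first walk keeps the leaf order the same as the order in which the YAML files list their members.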