gaoqiong / lm-evaluation-harness
Commit 44a602ab, authored Jun 25, 2024 by haileyschoelkopf
add many explicit group configs
parent c9801daf
Changes: 69
Showing 9 changed files with 30 additions and 16 deletions
lm_eval/tasks/mmlu/flan_n_shot/loglikelihood/_mmlu.yaml  +5 -5
lm_eval/tasks/mmlu/generative/_mmlu.yaml  +5 -5
lm_eval/tasks/okapi/arc_multilingual/_arc_yaml  +1 -1
lm_eval/tasks/okapi/hellaswag_multilingual/_hellaswag_yaml  +1 -1
lm_eval/tasks/okapi/mmlu_multilingual/_default_yaml  +1 -1
lm_eval/tasks/okapi/truthfulqa_multilingual/_truthfulqa_mc1_yaml  +1 -1
lm_eval/tasks/paws-x/_pawsx.yaml  +15 -0
lm_eval/tasks/paws-x/pawsx_template_yaml  +0 -1
lm_eval/tasks/qa4mre/qa4mre_2011.yaml  +1 -1
lm_eval/tasks/mmlu/flan_n_shot/loglikelihood/_mmlu.yaml
...
@@ -4,28 +4,28 @@ task:
   - group: stem
     task:
       - mmlu_flan_n_shot_loglikelihood_stem
-    aggregate_metric:
+    aggregate_metric_list:
       - metric: acc
         weight_by_size: True
   - group: other
     task:
       - mmlu_flan_n_shot_loglikelihood_other
-    aggregate_metric:
+    aggregate_metric_list:
       - metric: acc
         weight_by_size: True
   - group: social sciences
     task:
       - mmlu_flan_n_shot_loglikelihood_social_sciences
-    aggregate_metric:
+    aggregate_metric_list:
       - metric: acc
         weight_by_size: True
   - group: humanities
     task:
       - mmlu_flan_n_shot_loglikelihood_humanities
-    aggregate_metric:
+    aggregate_metric_list:
       - metric: acc
         weight_by_size: True
-aggregate_metric:
+aggregate_metric_list:
   - metric: acc
     weight_by_size: True
 metadata:
...
lm_eval/tasks/mmlu/generative/_mmlu.yaml
...
@@ -4,28 +4,28 @@ task:
   - group: stem
     task:
       - mmlu_stem_generative
-    aggregate_metric:
+    aggregate_metric_list:
       - metric: acc
         weight_by_size: True
   - group: other
     task:
       - mmlu_other_generative
-    aggregate_metric:
+    aggregate_metric_list:
       - metric: acc
         weight_by_size: True
   - group: social sciences
     task:
       - mmlu_social_sciences_generative
-    aggregate_metric:
+    aggregate_metric_list:
       - metric: acc
         weight_by_size: True
   - group: humanities
     task:
       - mmlu_humanities_generative
-    aggregate_metric:
+    aggregate_metric_list:
       - metric: acc
         weight_by_size: True
-aggregate_metric:
+aggregate_metric_list:
   - metric: acc
     weight_by_size: True
 metadata:
...
lm_eval/tasks/okapi/arc_multilingual/_arc_yaml
-group:
+tag:
   - arc_multilingual
 dataset_path: null
 dataset_name: null
...
lm_eval/tasks/okapi/hellaswag_multilingual/_hellaswag_yaml
-group:
+tag:
   - hellaswag_multilingual
 dataset_path: null
 dataset_name: null
...
lm_eval/tasks/okapi/mmlu_multilingual/_default_yaml
-group:
+tag:
   - m_mmlu
 dataset_path: alexandrainst/m_mmlu
 test_split: test
...
lm_eval/tasks/okapi/truthfulqa_multilingual/_truthfulqa_mc1_yaml
-group:
+tag:
   - truthfulqa_multilingual
 dataset_path: null
 dataset_name: null
...
lm_eval/tasks/paws-x/_pawsx.yaml
new file, mode 0 → 100644
+group: pawsx
+task:
+  - paws_en
+  - paws_de
+  - paws_es
+  - paws_fr
+  - paws_ja
+  - paws_ko
+  - paws_zh
+aggregate_metric_list:
+  - metric: acc
+    aggregation: mean
+    weight_by_size: true
+metadata:
+  version: 0.0
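For readers unfamiliar with the `aggregate_metric_list` keys used in the configs above, the following is a minimal Python sketch of what `aggregation: mean` combined with `weight_by_size` plausibly means: a group score that is either a size-weighted or a plain average of the subtask scores. This is an illustration only, not the harness's implementation; the function name `aggregate_mean`, the accuracies, and the example counts are all hypothetical.

```python
from typing import Sequence


def aggregate_mean(
    scores: Sequence[float],
    sizes: Sequence[int],
    weight_by_size: bool = True,
) -> float:
    """Combine per-subtask scores into a single group score.

    Hypothetical helper, not part of lm-evaluation-harness.
    With weight_by_size=True each subtask counts in proportion to its
    number of examples (micro-average); otherwise every subtask counts
    equally (macro-average).
    """
    if not scores or len(scores) != len(sizes):
        raise ValueError("scores and sizes must be non-empty and the same length")
    if not weight_by_size:
        return sum(scores) / len(scores)
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total size must be positive")
    return sum(score * n for score, n in zip(scores, sizes)) / total


# Made-up accuracies and example counts for three of the pawsx subtasks.
subtasks = ["paws_en", "paws_de", "paws_ja"]
acc = [0.90, 0.80, 0.70]
n_examples = [2000, 2000, 1000]

weighted = aggregate_mean(acc, n_examples, weight_by_size=True)
unweighted = aggregate_mean(acc, n_examples, weight_by_size=False)
print(f"size-weighted mean acc: {weighted:.4f}")
print(f"unweighted mean acc:    {unweighted:.4f}")
```

The two numbers differ whenever subtasks have different sizes, which is why the group configs state the weighting explicitly rather than leaving it implicit.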
lm_eval/tasks/paws-x/pawsx_template_yaml
 # This file will be included in the generated language-specific task configs.
 # It doesn't have a yaml file extension as it is not meant to be imported directly
 # by the harness.
-group: pawsx
 task: null
 dataset_path: paws-x
 dataset_name: null
...
lm_eval/tasks/qa4mre/qa4mre_2011.yaml
-group:
+tag:
   - qa4mre
 task: qa4mre_2011
 dataset_path: qa4mre
...