gaoqiong / lm-evaluation-harness
Commit 6348b947
authored Jul 03, 2024 by haileyschoelkopf
fix bbh aggregation filter usage
parent 94673d40
Showing 5 changed files with 8 additions and 4 deletions
lm_eval/tasks/bbh/cot_fewshot/_bbh.yaml (+1 -0)
lm_eval/tasks/bbh/cot_fewshot/_bbh_cot_fewshot.yaml (+2 -1)
lm_eval/tasks/bbh/cot_zeroshot/_bbh_cot_zeroshot.yaml (+2 -1)
lm_eval/tasks/bbh/fewshot/_bbh_fewshot.yaml (+1 -1)
lm_eval/tasks/bbh/zeroshot/_bbh_zeroshot.yaml (+2 -1)
lm_eval/tasks/bbh/cot_fewshot/_bbh.yaml
@@ -31,5 +31,6 @@ aggregate_metric_list:
   - metric: exact_match
     aggregation: mean
     weight_by_size: true
+    filter_list: get-answer
 metadata:
   version: 2.0
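The added `filter_list` key tells the group-level aggregation which filter's `exact_match` scores to combine. As a rough illustration only, the sketch below computes a size-weighted mean over subtasks for one named filter. It is not the harness's implementation: the function name, the `size`/`scores` record layout, the `"metric,filter"` key convention, and the sample numbers are all assumptions made for this example.

```python
from typing import Dict, List


def weighted_mean_for_filter(
    subtask_results: List[Dict],
    metric: str,
    filter_name: str,
) -> float:
    """Size-weighted mean of `metric` across subtasks for one filter.

    Each record is assumed (hypothetically) to look like:
        {"size": 250, "scores": {"exact_match,get-answer": 0.5}}
    """
    key = f"{metric},{filter_name}"
    weighted_sum = 0.0
    total_size = 0
    for record in subtask_results:
        if key not in record["scores"]:
            # Aggregating under a filter the subtask never produced is
            # treated as an error here rather than silently skipped.
            raise KeyError(f"no score recorded under {key!r}")
        weighted_sum += record["scores"][key] * record["size"]
        total_size += record["size"]
    return weighted_sum / total_size


# Hypothetical subtask results; the numbers are made up for illustration.
results = [
    {"size": 250, "scores": {"exact_match,get-answer": 0.5}},
    {"size": 150, "scores": {"exact_match,get-answer": 0.25}},
]
group_score = weighted_mean_for_filter(results, "exact_match", "get-answer")
print(group_score)  # 0.40625 (an unweighted mean would give 0.375)
```

With `weight_by_size: true`, larger subtasks contribute proportionally more to the group score, which is why the result differs from a plain average of the two subtask scores.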
lm_eval/tasks/bbh/cot_fewshot/_bbh_cot_fewshot.yaml
@@ -5,7 +5,7 @@ task:
   - bbh_cot_fewshot_date_understanding
   - bbh_cot_fewshot_disambiguation_qa
   - bbh_cot_fewshot_dyck_languages
-  - bbh_cot_fewshot_formal_languages
+  - bbh_cot_fewshot_formal_fallacies
   - bbh_cot_fewshot_geometric_shapes
   - bbh_cot_fewshot_hyperbaton
   - bbh_cot_fewshot_logical_deduction_five_objects
@@ -31,5 +31,6 @@ aggregate_metric_list:
   - metric: exact_match
     aggregation: mean
     weight_by_size: true
+    filter_list: get-answer
 metadata:
   version: 2.0
lm_eval/tasks/bbh/cot_zeroshot/_bbh_cot_zeroshot.yaml
@@ -5,7 +5,7 @@ task:
   - bbh_cot_zeroshot_date_understanding
   - bbh_cot_zeroshot_disambiguation_qa
   - bbh_cot_zeroshot_dyck_languages
-  - bbh_cot_zeroshot_formal_languages
+  - bbh_cot_zeroshot_formal_fallacies
   - bbh_cot_zeroshot_geometric_shapes
   - bbh_cot_zeroshot_hyperbaton
   - bbh_cot_zeroshot_logical_deduction_five_objects
@@ -31,5 +31,6 @@ aggregate_metric_list:
   - metric: exact_match
     aggregation: mean
     weight_by_size: true
+    filter_list: flexible-extract
 metadata:
   version: 2.0
lm_eval/tasks/bbh/fewshot/_bbh_fewshot.yaml
@@ -5,7 +5,7 @@ task:
   - bbh_fewshot_date_understanding
   - bbh_fewshot_disambiguation_qa
   - bbh_fewshot_dyck_languages
-  - bbh_fewshot_formal_languages
+  - bbh_fewshot_formal_fallacies
   - bbh_fewshot_geometric_shapes
   - bbh_fewshot_hyperbaton
   - bbh_fewshot_logical_deduction_five_objects
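The task-list hunks replace a `formal_languages` entry with `formal_fallacies`, which suggests the old name did not match any defined subtask. A check along the following lines could catch that kind of mismatch early. This is a minimal sketch under stated assumptions: `find_unknown_tasks` and the hard-coded `registered` set are hypothetical and stand in for whatever task registry is actually available.

```python
from typing import Iterable, List, Set


def find_unknown_tasks(group_tasks: Iterable[str], registered: Set[str]) -> List[str]:
    """Return the group entries that do not name a registered task, in order."""
    return [name for name in group_tasks if name not in registered]


# Hypothetical registry contents, limited to names visible in the hunk above.
registered = {
    "bbh_fewshot_date_understanding",
    "bbh_fewshot_disambiguation_qa",
    "bbh_fewshot_dyck_languages",
    "bbh_fewshot_formal_fallacies",
    "bbh_fewshot_geometric_shapes",
    "bbh_fewshot_hyperbaton",
    "bbh_fewshot_logical_deduction_five_objects",
}

# The group list as it stood before this commit (old entry included).
old_group = [
    "bbh_fewshot_date_understanding",
    "bbh_fewshot_disambiguation_qa",
    "bbh_fewshot_dyck_languages",
    "bbh_fewshot_formal_languages",
    "bbh_fewshot_geometric_shapes",
    "bbh_fewshot_hyperbaton",
    "bbh_fewshot_logical_deduction_five_objects",
]

unknown = find_unknown_tasks(old_group, registered)
print(unknown)  # ['bbh_fewshot_formal_languages']
```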
lm_eval/tasks/bbh/zeroshot/_bbh_zeroshot.yaml
@@ -5,7 +5,7 @@ task:
   - bbh_zeroshot_date_understanding
   - bbh_zeroshot_disambiguation_qa
   - bbh_zeroshot_dyck_languages
-  - bbh_zeroshot_formal_languages
+  - bbh_zeroshot_formal_fallacies
   - bbh_zeroshot_geometric_shapes
   - bbh_zeroshot_hyperbaton
   - bbh_zeroshot_logical_deduction_five_objects
@@ -31,5 +31,6 @@ aggregate_metric_list:
   - metric: exact_match
     aggregation: mean
     weight_by_size: true
+    filter_list: flexible-extract
 metadata:
   version: 2.0