gaoqiong / lm-evaluation-harness
Commit e5dfd030 (Unverified)
Authored Dec 07, 2023 by Lintang Sutawika
Committed by GitHub, Dec 07, 2023
Merge pull request #1074 from EleutherAI/lintangsutawika-patch-4
Update _cot_fewshot_template_yaml
Parents: f0b96491, 965c5330
Showing 8 changed files with 8 additions and 1 deletion
README.md (+1 -1)
lm_eval/tasks/bbh/cot_fewshot/_cot_fewshot_template_yaml (+1 -0)
lm_eval/tasks/bbh/cot_zeroshot/_cot_zeroshot_template_yaml (+1 -0)
lm_eval/tasks/bbh/fewshot/_fewshot_template_yaml (+1 -0)
lm_eval/tasks/bbh/zeroshot/_zeroshot_template_yaml (+1 -0)
lm_eval/tasks/minerva_math/minerva_math_algebra.yaml (+1 -0)
lm_eval/tasks/mmlu/flan_cot_fewshot/_mmlu_flan_cot_fewshot_template_yaml (+1 -0)
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu_flan_cot_zeroshot_template_yaml (+1 -0)
README.md
@@ -3,7 +3,7 @@
 [ ](https://doi.org/10.5281/zenodo.10256836)
 ## Announcement
 **A new v0.4.0 release of lm-evaluation-harness is available** !
 New updates and features include:
lm_eval/tasks/bbh/cot_fewshot/_cot_fewshot_template_yaml
@@ -24,5 +24,6 @@ filter_list:
       - function: "regex"
         regex_pattern: "(?<=the answer is )(.*)(?=.)"
       - function: "take_first"
+num_fewshot: 0
 metadata:
   - version: 0.0
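The filter in this hunk applies a regex and then keeps the first result. A minimal sketch of that behaviour using only Python's `re` module follows. The helper name `extract_answer` and the `"[invalid]"` fallback are assumptions for illustration, not the harness's own code.

```python
import re

# Pattern copied verbatim from the regex_pattern field in the hunk above.
PATTERN = re.compile(r"(?<=the answer is )(.*)(?=.)")


def extract_answer(response: str, fallback: str = "[invalid]") -> str:
    """Hypothetical helper: apply the regex, then keep the first match."""
    matches = PATTERN.findall(response)
    return matches[0] if matches else fallback


print(extract_answer("So the answer is (B)."))  # the trailing period is dropped
print(extract_answer("the answer is yes"))      # the last character is dropped
print(extract_answer("No marker here."))        # fallback value
```

Because the `.` inside the lookahead is unescaped, it matches any character, not only a literal period. A response that ends without punctuation therefore loses its final character.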
lm_eval/tasks/bbh/cot_zeroshot/_cot_zeroshot_template_yaml
@@ -22,5 +22,6 @@ filter_list:
       - function: "regex"
         regex_pattern: "((?<=The answer is )(.*)(?=.)|(?<=the answer is )(.*)(?=.)|(?<=The answer: )(.*)(?=.)|(?<=The final answer: )(.*)(?=.))"
       - function: "take_first"
+num_fewshot: 0
 metadata:
   - version: 0
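The zero-shot pattern is an alternation over four answer phrasings. The sketch below checks which phrasings it accepts, using `re.search` and the outer capture group. The helper name `first_answer` is hypothetical, and the harness may combine the capture groups differently.

```python
import re
from typing import Optional

# Pattern copied verbatim from the hunk above, split across two literals.
PATTERN = re.compile(
    r"((?<=The answer is )(.*)(?=.)|(?<=the answer is )(.*)(?=.)"
    r"|(?<=The answer: )(.*)(?=.)|(?<=The final answer: )(.*)(?=.))"
)


def first_answer(response: str) -> Optional[str]:
    """Hypothetical helper: text captured by the outer group, or None."""
    match = PATTERN.search(response)
    return match.group(1) if match else None


for text in (
    "The answer is True.",
    "I think the answer is (C).",
    "The answer: 7.",
    "The final answer: 12.",
    "Answer unclear",
):
    print(repr(text), "->", first_answer(text))
```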
lm_eval/tasks/bbh/fewshot/_fewshot_template_yaml
@@ -16,5 +16,6 @@ generation_kwargs:
     - "\n\n"
   do_sample: false
   temperature: 0.0
+num_fewshot: 0
 metadata:
   - version: 0
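Assuming the `"\n\n"` entry under `generation_kwargs` belongs to a list of stop sequences (the enclosing key is not visible in this hunk), cutting a generation at the earliest stop sequence can be sketched as below. The function name `truncate_at_stop` is hypothetical and this is not the harness's implementation.

```python
from typing import Sequence


def truncate_at_stop(text: str, stop_sequences: Sequence[str]) -> str:
    """Hypothetical helper: cut text at the earliest stop sequence, if any."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


print(truncate_at_stop("A: yes\n\nQ: next question", ["\n\n"]))
```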
lm_eval/tasks/bbh/zeroshot/_zeroshot_template_yaml
@@ -16,5 +16,6 @@ generation_kwargs:
     - "\n\n"
   do_sample: false
   temperature: 0.0
+num_fewshot: 0
 metadata:
   - version: 0
lm_eval/tasks/minerva_math/minerva_math_algebra.yaml
@@ -19,5 +19,6 @@ metric_list:
   - metric: exact_match
     aggregation: mean
     higher_is_better: true
+num_fewshot: 0
 metadata:
   - version: 0.0
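The `metric_list` entry above scores each example with `exact_match` and aggregates with `mean`. A rough sketch of that combination follows. It assumes plain string equality, which may differ from how the task actually compares answers, and the function names are hypothetical.

```python
from typing import Sequence


def exact_match(prediction: str, reference: str) -> float:
    """Assumed scoring rule: 1.0 on string equality, otherwise 0.0."""
    return 1.0 if prediction == reference else 0.0


def mean(scores: Sequence[float]) -> float:
    """Arithmetic mean; returns 0.0 for an empty list to avoid dividing by zero."""
    return sum(scores) / len(scores) if scores else 0.0


predictions = ["4", "x=2", "7"]
references = ["4", "x=3", "7"]
scores = [exact_match(p, r) for p, r in zip(predictions, references)]
print(mean(scores))  # two of three predictions match
```

With `higher_is_better: true`, a larger mean indicates a better result.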
lm_eval/tasks/mmlu/flan_cot_fewshot/_mmlu_flan_cot_fewshot_template_yaml
@@ -15,6 +15,7 @@ generation_kwargs:
     - "</s>"
   do_sample: false
   temperature: 0.0
+num_fewshot: 0
 metric_list:
   - metric: exact_match
     aggregation: mean
lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu_flan_cot_zeroshot_template_yaml
@@ -15,6 +15,7 @@ generation_kwargs:
     - "</s>"
   do_sample: false
   temperature: 0.0
+num_fewshot: 0
 metric_list:
   - metric: exact_match
     aggregation: mean