gaoqiong / lm-evaluation-harness
Commit d684b9eb, authored Dec 18, 2024 by Baber

fix do_sample

parent adbfcce1
Changes (33)

Showing 13 changed files with 13 additions and 13 deletions
lm_eval/tasks/longbench/passage_retrieval_en_e.yaml  +1 -1
lm_eval/tasks/longbench/passage_retrieval_zh.yaml  +1 -1
lm_eval/tasks/longbench/qasper.yaml  +1 -1
lm_eval/tasks/longbench/qasper_e.yaml  +1 -1
lm_eval/tasks/longbench/qmsum.yaml  +1 -1
lm_eval/tasks/longbench/repobench-p.yaml  +1 -1
lm_eval/tasks/longbench/repobench-p_e.yaml  +1 -1
lm_eval/tasks/longbench/samsum.yaml  +1 -1
lm_eval/tasks/longbench/samsum_e.yaml  +1 -1
lm_eval/tasks/longbench/trec.yaml  +1 -1
lm_eval/tasks/longbench/trec_e.yaml  +1 -1
lm_eval/tasks/longbench/triviaqa.yaml  +1 -1
lm_eval/tasks/longbench/triviaqa_e.yaml  +1 -1
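Every file in this list carries the same one-line change to `do_sample`, so a quick consistency check over the task configs may be useful when reviewing. The following is a minimal sketch, not part of the commit: `read_do_sample` and `files_not_sampling` are hypothetical helper names, and the check scans the YAML text with a regular expression instead of a YAML parser, on the assumption that the custom `!function` tag in these files would need a dedicated constructor to load.

```python
import re

# Matches a `do_sample:` key on its own line and captures its scalar value.
DO_SAMPLE_RE = re.compile(
    r"^[ \t]*do_sample[ \t]*:[ \t]*(\S+)[ \t]*$", re.MULTILINE
)


def read_do_sample(yaml_text):
    """Return the do_sample value as a bool, or None if the key is absent."""
    match = DO_SAMPLE_RE.search(yaml_text)
    if match is None:
        return None
    return match.group(1).lower() == "true"


def files_not_sampling(configs):
    """Given {filename: yaml_text}, list files whose do_sample is not true."""
    return sorted(
        name for name, text in configs.items()
        if read_do_sample(text) is not True
    )
```

Passing a mapping of file names to file contents (for example, read from `lm_eval/tasks/longbench/`) returns the names that still need the change; an empty list means every config enables sampling.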
lm_eval/tasks/longbench/passage_retrieval_en_e.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 32
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.retrieval_score
     aggregation: mean
lm_eval/tasks/longbench/passage_retrieval_zh.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 32
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.retrieval_zh_score
     aggregation: mean
lm_eval/tasks/longbench/qasper.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 128
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.qa_f1_score
     aggregation: mean
lm_eval/tasks/longbench/qasper_e.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 128
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.qa_f1_score
     aggregation: mean
lm_eval/tasks/longbench/qmsum.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 512
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.rouge_score
     aggregation: mean
lm_eval/tasks/longbench/repobench-p.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 64
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.code_sim_score
     aggregation: mean
lm_eval/tasks/longbench/repobench-p_e.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 64
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.code_sim_score
     aggregation: mean
lm_eval/tasks/longbench/samsum.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 128
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.rouge_score
     aggregation: mean
lm_eval/tasks/longbench/samsum_e.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 128
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.rouge_score
     aggregation: mean
lm_eval/tasks/longbench/trec.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 64
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.classification_score
     aggregation: mean
lm_eval/tasks/longbench/trec_e.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 64
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.classification_score
     aggregation: mean
lm_eval/tasks/longbench/triviaqa.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 32
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.qa_f1_score
     aggregation: mean
lm_eval/tasks/longbench/triviaqa_e.yaml
@@ -10,7 +10,7 @@ doc_to_target: '{{answers}}'
 generation_kwargs:
   max_gen_toks: 32
   temperature: 1
-  do_sample: False
+  do_sample: True
 metric_list:
   - metric: !function metrics.qa_f1_score
     aggregation: mean
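For context on what the flipped flag typically controls: in most text-generation stacks, `do_sample: False` selects greedy decoding, while `do_sample: True` draws the next token from the temperature-scaled softmax distribution. The sketch below illustrates that general distinction in plain Python. `pick_token` is a hypothetical helper written for this note, not code from lm-evaluation-harness or any backend it calls, and the harness's actual handling of `generation_kwargs` may differ.

```python
import math
import random


def pick_token(logits, do_sample, temperature=1.0, rng=None):
    """Return the index of the next token given raw logits (illustrative only)."""
    if not do_sample:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random()
    # Softmax with temperature, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    # Draw one index with probability proportional to its softmax weight.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

Under this reading, the configs in this commit keep `temperature: 1` and switch from the greedy branch to the sampling branch, so repeated runs may produce different generations unless a seed is fixed.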