gaoqiong / lm-evaluation-harness
Commit 4eecbabb, authored Sep 16, 2024 by Baber
Merge branch 'main' into prefill
Parents: dac8b534, fb963f0f
Changes: 465
Showing 20 changed files with 460 additions and 0 deletions (+460 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_astronomy_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_business_ethics_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_clinical_knowledge_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_biology_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_chemistry_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_computer_science_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_mathematics_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_medicine_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_physics_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_computer_security_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_conceptual_physics_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_econometrics_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_electrical_engineering_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_elementary_mathematics_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_formal_logic_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_global_facts_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_high_school_biology_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_high_school_chemistry_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_high_school_computer_science_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_high_school_european_history_light.yaml (+23 -0)
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_astronomy_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_astronomy_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: astronomy
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
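The 20 files in this commit differ only in the `task` and `dataset_name` values. As a minimal sketch of that pattern (not the script the authors used, which this page does not show), the configs could be rendered from one template. The names `CONFIG_TEMPLATE` and `render_config` are hypothetical.

```python
from string import Template

# Template mirroring the 23-line config shown above; only the subject varies.
# string.Template substitutes ${subject} and leaves the Jinja-style {{...}} intact.
CONFIG_TEMPLATE = Template("""\
task: arabic_leaderboard_arabic_mmlu_${subject}_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: ${subject}
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
""")


def render_config(subject: str) -> str:
    """Return the YAML text for one Arabic MMLU (light) subject."""
    return CONFIG_TEMPLATE.substitute(subject=subject)


if __name__ == "__main__":
    # Two of the subjects listed in this commit.
    for subject in ("astronomy", "business_ethics"):
        print(render_config(subject))
```

Keeping the shared fields in one place makes it harder for a single subject file to drift from the others.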
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_business_ethics_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_business_ethics_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: business_ethics
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
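Each config reports both `acc` and `acc_norm`, aggregated with `mean`. The sketch below illustrates the usual difference between the two for a multiple-choice task: `acc` picks the choice with the highest log-likelihood, while `acc_norm` first divides each log-likelihood by the length of the choice text. Normalizing by character count is an assumption made for this illustration, and the function names are hypothetical, not the harness's API.

```python
def pick_acc(loglikelihoods):
    """Index of the choice with the highest raw log-likelihood."""
    return max(range(len(loglikelihoods)), key=lambda i: loglikelihoods[i])


def pick_acc_norm(loglikelihoods, choices):
    """Index of the best choice after length normalization.

    Assumption: normalize by the character length of each choice string.
    """
    normalized = [ll / max(len(c), 1) for ll, c in zip(loglikelihoods, choices)]
    return max(range(len(normalized)), key=lambda i: normalized[i])


def mean(values):
    """The `mean` aggregation over per-document 0/1 scores."""
    return sum(values) / len(values)


if __name__ == "__main__":
    choices = ["yes", "a much longer answer"]
    lls = [-3.0, -6.0]  # made-up log-likelihoods for illustration
    gold = 1
    # Raw scores favor the short choice; per-character scores favor the long one.
    print(pick_acc(lls) == gold, pick_acc_norm(lls, choices) == gold)
```

Long answer strings accumulate more negative log-likelihood simply because they contain more tokens, which is why a length-normalized variant is reported alongside plain accuracy.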
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_clinical_knowledge_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_clinical_knowledge_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: clinical_knowledge
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
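The config refers to `utils.process_docs`, which is not part of this page. Because `doc_to_text`, `doc_to_target`, and `doc_to_choice` read the fields `query`, `gold`, and `choices`, that function must produce documents carrying those three keys. The sketch below shows one way such a transform could look on plain dictionaries. The raw field names (`question`, `A` to `D`, `answer`) and the prompt layout are assumptions for illustration only, not the contents of the real `utils.py`.

```python
LETTERS = ["A", "B", "C", "D"]


def process_doc(raw):
    """Map one raw record to the fields the YAML config expects.

    Hypothetical input schema: {"question", "A", "B", "C", "D", "answer"}.
    """
    choices = [raw[letter] for letter in LETTERS]
    options = "\n".join(f"{letter}. {text}" for letter, text in zip(LETTERS, choices))
    return {
        "query": f"{raw['question']}\n{options}\nAnswer:",  # read by doc_to_text
        "choices": choices,                                # read by doc_to_choice
        "gold": LETTERS.index(raw["answer"]),              # read by doc_to_target
    }


def process_docs(records):
    """Apply process_doc to every record (here a plain list, for simplicity)."""
    return [process_doc(r) for r in records]


if __name__ == "__main__":
    sample = {
        "question": "Which planet is closest to the Sun?",
        "A": "Venus", "B": "Mercury", "C": "Earth", "D": "Mars",
        "answer": "B",
    }
    print(process_docs([sample])[0])
```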
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_biology_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_college_biology_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: college_biology
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
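The combination `fewshot_split: dev` and `sampler: first_n` means few-shot examples come from the `dev` split and are taken in order rather than sampled at random, so every run sees the same demonstrations. A minimal sketch of that selection rule follows. The function names and the error raised when too many shots are requested are assumptions of this sketch, not the harness's code.

```python
def first_n(fewshot_docs, n):
    """Return the first n few-shot documents, in their original order."""
    if n > len(fewshot_docs):
        raise ValueError(
            f"requested {n} few-shot examples but only {len(fewshot_docs)} are available"
        )
    return list(fewshot_docs[:n])


def build_context(fewshot_docs, n, query):
    """Join n solved examples and the new query into one prompt string."""
    shots = [f"{doc['query']} {doc['answer']}" for doc in first_n(fewshot_docs, n)]
    return "\n\n".join(shots + [query])


if __name__ == "__main__":
    # Placeholder dev-split documents for illustration.
    dev = [
        {"query": "Q1", "answer": "A"},
        {"query": "Q2", "answer": "C"},
        {"query": "Q3", "answer": "B"},
    ]
    print(build_context(dev, 2, "Q4"))
```

A deterministic sampler trades prompt diversity for reproducibility, which suits leaderboard-style comparisons.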
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_chemistry_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_college_chemistry_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: college_chemistry
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_computer_science_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_college_computer_science_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: college_computer_science
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_mathematics_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_college_mathematics_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: college_mathematics
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_medicine_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_college_medicine_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: college_medicine
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_college_physics_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_college_physics_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: college_physics
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_computer_security_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_computer_security_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: computer_security
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_conceptual_physics_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_conceptual_physics_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: conceptual_physics
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_econometrics_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_econometrics_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: econometrics
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_electrical_engineering_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_electrical_engineering_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: electrical_engineering
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_elementary_mathematics_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_elementary_mathematics_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: elementary_mathematics
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_formal_logic_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_formal_logic_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: formal_logic
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_global_facts_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_global_facts_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: global_facts
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_high_school_biology_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_high_school_biology_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: high_school_biology
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_high_school_chemistry_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_high_school_chemistry_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: high_school_chemistry
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_high_school_computer_science_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_high_school_computer_science_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: high_school_computer_science
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
lm_eval/tasks/arabic_leaderboard_light/arabic_leaderboard_arabic_mmlu_light/arabic_leaderboard_arabic_mmlu_high_school_european_history_light.yaml
0 → 100644

task: arabic_leaderboard_arabic_mmlu_high_school_european_history_light
dataset_path: arcee-globe/Arabic_MMLU-10percent
dataset_name: high_school_european_history
output_type: multiple_choice
training_split: null
validation_split: dev
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "{{query}}"
doc_to_target: "{{gold}}"
doc_to_choice: "choices"
fewshot_split: dev
fewshot_config:
  sampler: first_n
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0