gaoqiong / lm-evaluation-harness
Commit 90ad5db7
Authored Mar 01, 2024 by lintangsutawika
merged main
Parents: f692caa9, b177c82c
Changes: 484
Showing 20 changed files with 63 additions and 7 deletions
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_mechanical_engineering.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_nondestructive_testing.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_patent.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_political_science_and_sociology.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_psychology.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_public_safety.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_railway_and_automotive_engineering.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_real_estate.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_refrigerating_machinery.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_social_welfare.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_taxation.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_telecommunications_and_wireless_technology.yaml (+3 -0)
lm_eval/tasks/kmmlu/hard/_hard_kmmlu_yaml (+6 -7)
lm_eval/tasks/kmmlu/hard/kmmlu_hard_accounting.yaml (+3 -0)
lm_eval/tasks/kmmlu/hard/kmmlu_hard_agricultural_sciences.yaml (+3 -0)
lm_eval/tasks/kmmlu/hard/kmmlu_hard_aviation_engineering_and_maintenance.yaml (+3 -0)
lm_eval/tasks/kmmlu/hard/kmmlu_hard_biology.yaml (+3 -0)
lm_eval/tasks/kmmlu/hard/kmmlu_hard_chemical_engineering.yaml (+3 -0)
lm_eval/tasks/kmmlu/hard/kmmlu_hard_chemistry.yaml (+3 -0)
lm_eval/tasks/kmmlu/hard/kmmlu_hard_civil_engineering.yaml (+3 -0)
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_mechanical_engineering.yaml
0 → 100644
dataset_name: mechanical_engineering
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_mechanical_engineering
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_nondestructive_testing.yaml
0 → 100644
dataset_name: nondestructive_testing
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_nondestructive_testing
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_patent.yaml
0 → 100644
dataset_name: patent
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_patent
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_political_science_and_sociology.yaml
0 → 100644
dataset_name: political_science_and_sociology
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_political_science_and_sociology
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_psychology.yaml
0 → 100644
dataset_name: psychology
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_psychology
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_public_safety.yaml
0 → 100644
dataset_name: public_safety
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_public_safety
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_railway_and_automotive_engineering.yaml
0 → 100644
dataset_name: railway_and_automotive_engineering
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_railway_and_automotive_engineering
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_real_estate.yaml
0 → 100644
dataset_name: real_estate
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_real_estate
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_refrigerating_machinery.yaml
0 → 100644
dataset_name: refrigerating_machinery
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_refrigerating_machinery
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_social_welfare.yaml
0 → 100644
dataset_name: social_welfare
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_social_welfare
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_taxation.yaml
0 → 100644
dataset_name: taxation
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_taxation
lm_eval/tasks/kmmlu/direct_hard/kmmlu_direct_hard_telecommunications_and_wireless_technology.yaml
0 → 100644
dataset_name: telecommunications_and_wireless_technology
include: _direct_hard_kmmlu_yaml
task: kmmlu_hard_direct_telecommunications_and_wireless_technology
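The twelve direct_hard files above differ only in the subject name, so they could plausibly be produced from a template. The following is a minimal sketch of that idea, not the script the author used (none appears in this commit). The helper names `render_direct_hard_config` and `config_filename` are hypothetical. Note that the file names use `kmmlu_direct_hard_<subject>` while the task names use `kmmlu_hard_direct_<subject>`, exactly as in the files above.

```python
# Hypothetical helpers; the commit itself does not include a generator script.

# A few of the subjects that appear in the files above.
SUBJECTS = ["mechanical_engineering", "patent", "taxation"]


def config_filename(subject: str) -> str:
    """File name pattern used under lm_eval/tasks/kmmlu/direct_hard/."""
    return f"kmmlu_direct_hard_{subject}.yaml"


def render_direct_hard_config(subject: str) -> str:
    """Render the three-key per-subject config as YAML text."""
    lines = [
        f"dataset_name: {subject}",
        "include: _direct_hard_kmmlu_yaml",
        # The task name swaps the word order relative to the file name.
        f"task: kmmlu_hard_direct_{subject}",
    ]
    return "\n".join(lines) + "\n"


for subject in SUBJECTS:
    print(config_filename(subject))
    print(render_direct_hard_config(subject))
```

Writing the rendered text to disk is left out on purpose so the sketch has no side effects.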
lm_eval/tasks/kmmlu/_default_kmmlu_yaml → lm_eval/tasks/kmmlu/hard/_hard_kmmlu_yaml
-group: kmmlu
-dataset_path: HAERAE-HUB/K-MMLU-Preview
+group:
+  - kmmlu
+  - kmmlu_hard
+dataset_path: HAERAE-HUB/KMMLU-HARD
 output_type: multiple_choice
-training_split: train
-validation_split: dev
 test_split: test
 fewshot_split: dev
-output_type: multiple_choice
 doc_to_text: "{{question.strip()}}\nA. {{A}}\nB. {{B}}\nC. {{C}}\nD. {{D}}\n정답:"
 doc_to_choice: ["A", "B", "C", "D"]
-doc_to_target: "{{['A', 'B', 'C', 'D'][answer-1]}}"
+doc_to_target: "{{answer-1}}"
 metric_list:
   - metric: acc
     aggregation: mean
...
@@ -17,4 +16,4 @@ metric_list:
     aggregation: mean
     higher_is_better: true
 metadata:
-  version: 1.1
+  version: 2.0
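The `doc_to_target` change swaps a template that yields a letter for one that yields a number. The sketch below mirrors the two templates and the `doc_to_text` prompt in plain Python to show the difference. It assumes, as the `answer-1` expression implies, that `answer` is a 1-based integer. The record and the function names are hypothetical, and how the harness interprets the rendered value is not shown in this diff.

```python
CHOICES = ["A", "B", "C", "D"]  # doc_to_choice from the config


def build_prompt(doc: dict) -> str:
    """Plain-Python equivalent of the doc_to_text template."""
    return (
        f"{doc['question'].strip()}\n"
        f"A. {doc['A']}\nB. {doc['B']}\nC. {doc['C']}\nD. {doc['D']}\n"
        "정답:"  # Korean for "Answer:"
    )


def target_letter(doc: dict) -> str:
    """Removed form: {{['A', 'B', 'C', 'D'][answer-1]}} yields the letter."""
    return CHOICES[doc["answer"] - 1]


def target_index(doc: dict) -> int:
    """Added form: {{answer-1}} yields a 0-based index into CHOICES."""
    return doc["answer"] - 1


# Hypothetical record; real KMMLU rows are not part of this commit.
example_doc = {
    "question": " 2 + 2 = ? ",
    "A": "3", "B": "4", "C": "5", "D": "6",
    "answer": 2,
}

print(build_prompt(example_doc))
print(target_letter(example_doc), target_index(example_doc))
```

Under that assumption both forms identify the same choice: indexing `CHOICES` with the new value gives back the letter the old template produced.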
lm_eval/tasks/kmmlu/hard/kmmlu_hard_accounting.yaml
0 → 100644
dataset_name: accounting
include: _hard_kmmlu_yaml
task: kmmlu_hard_accounting
lm_eval/tasks/kmmlu/hard/kmmlu_hard_agricultural_sciences.yaml
0 → 100644
dataset_name: agricultural_sciences
include: _hard_kmmlu_yaml
task: kmmlu_hard_agricultural_sciences
lm_eval/tasks/kmmlu/hard/kmmlu_hard_aviation_engineering_and_maintenance.yaml
0 → 100644
dataset_name: aviation_engineering_and_maintenance
include: _hard_kmmlu_yaml
task: kmmlu_hard_aviation_engineering_and_maintenance
lm_eval/tasks/kmmlu/hard/kmmlu_hard_biology.yaml
0 → 100644
dataset_name: biology
include: _hard_kmmlu_yaml
task: kmmlu_hard_biology
lm_eval/tasks/kmmlu/hard/kmmlu_hard_chemical_engineering.yaml
0 → 100644
dataset_name: chemical_engineering
include: _hard_kmmlu_yaml
task: kmmlu_hard_chemical_engineering
lm_eval/tasks/kmmlu/hard/kmmlu_hard_chemistry.yaml
0 → 100644
dataset_name: chemistry
include: _hard_kmmlu_yaml
task: kmmlu_hard_chemistry
lm_eval/tasks/kmmlu/hard/kmmlu_hard_civil_engineering.yaml
0 → 100644
dataset_name: civil_engineering
include: _hard_kmmlu_yaml
task: kmmlu_hard_civil_engineering
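Each per-subject file carries only `dataset_name`, `include`, and `task`; everything else comes from the shared `_hard_kmmlu_yaml`. The sketch below illustrates one way such an `include` could be resolved, assuming the base file is loaded first and the per-subject keys are overlaid on top. The function `resolve_include` is hypothetical and the harness's actual loader is not part of this diff. The base dictionary copies only keys visible in the `_hard_kmmlu_yaml` diff.

```python
# Base settings, copied from the keys visible in the _hard_kmmlu_yaml diff.
BASE_CONFIGS = {
    "_hard_kmmlu_yaml": {
        "group": ["kmmlu", "kmmlu_hard"],
        "dataset_path": "HAERAE-HUB/KMMLU-HARD",
        "output_type": "multiple_choice",
        "test_split": "test",
        "fewshot_split": "dev",
    }
}


def resolve_include(subject_cfg: dict, base_configs: dict) -> dict:
    """Overlay a per-subject config on the base it includes (assumed semantics)."""
    overrides = dict(subject_cfg)
    base_name = overrides.pop("include", None)
    # Copy the base so the shared dictionary is never mutated.
    merged = dict(base_configs[base_name]) if base_name else {}
    merged.update(overrides)  # per-subject keys win on conflict
    return merged


biology_cfg = {
    "dataset_name": "biology",
    "include": "_hard_kmmlu_yaml",
    "task": "kmmlu_hard_biology",
}

merged_cfg = resolve_include(biology_cfg, BASE_CONFIGS)
print(merged_cfg)
```

With this layout, adding a subject only requires a new three-line file, and a change to the shared prompt or metrics is made once in the base file.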