gaoqiong / lm-evaluation-harness
Commit 90ad5db7, authored Mar 01, 2024 by lintangsutawika
merged main
parents: f692caa9, b177c82c
Changes: 484
Showing 20 changed files with 84 additions and 0 deletions (+84, -0)
lm_eval/tasks/kmmlu/direct/_direct_kmmlu_yaml  +27 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_accounting.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_agricultural_sciences.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_aviation_engineering_and_maintenance.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_biology.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_chemical_engineering.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_chemistry.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_civil_engineering.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_computer_science.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_construction.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_criminal_law.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_ecology.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_economics.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_education.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_electrical_engineering.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_electronics_engineering.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_energy_management.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_environmental_science.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_fashion.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/kmmlu_direct_food_processing.yaml  +3 -0
lm_eval/tasks/kmmlu/direct/_direct_kmmlu_yaml
0 → 100644
group:
  - kmmlu
  - kmmlu_direct
dataset_path: HAERAE-HUB/KMMLU
output_type: generate_until
test_split: test
fewshot_split: dev
doc_to_text: "{{question.strip()}}\nA. {{A}}\nB. {{B}}\nC. {{C}}\nD. {{D}}\n정답:"
doc_to_target: "{{['A', 'B', 'C', 'D'][answer-1]}}"
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
    ignore_case: true
    ignore_punctuation: true
    regexes_to_ignore:
      - " "
generation_kwargs:
  until:
    - "Q:"
    - "\n\n"
    - "</s>"
    - "."
  do_sample: false
  temperature: 0.0
metadata:
  version: 2.0
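
To make the base config above easier to read, here is a minimal Python sketch of what its templates and metric options appear to do. It is not the harness's own code: the function names (`doc_to_text`, `doc_to_target`, `truncate_at_stop`, `normalize`, `exact_match`) and the sample document are hypothetical, the Jinja templates are emulated with f-strings, and the normalization only approximates `ignore_case`, `ignore_punctuation`, and `regexes_to_ignore`. The `answer-1` in the template suggests that the dataset's `answer` field is 1-based, which the sketch assumes.

```python
import re
import string

CHOICES = ["A", "B", "C", "D"]


def doc_to_text(doc):
    """Plain-Python stand-in for the doc_to_text Jinja template."""
    return (
        f"{doc['question'].strip()}\n"
        f"A. {doc['A']}\nB. {doc['B']}\nC. {doc['C']}\nD. {doc['D']}\n정답:"
    )


def doc_to_target(doc):
    """Map the (assumed 1-based) answer index to its letter."""
    return CHOICES[doc["answer"] - 1]


def truncate_at_stop(text, stops=("Q:", "\n\n", "</s>", ".")):
    """Cut a generation at the earliest stop sequence from `until`."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def normalize(text):
    """Approximate regexes_to_ignore [" "], ignore_case, ignore_punctuation."""
    text = re.sub(" ", "", text)
    text = text.lower()
    return text.translate(str.maketrans("", "", string.punctuation))


def exact_match(prediction, target):
    """Return 1.0 when the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(target))


# Hypothetical document, shaped like the fields the templates reference.
doc = {"question": " 1 + 1 = ? ", "A": "1", "B": "2", "C": "3", "D": "4", "answer": 2}
prompt = doc_to_text(doc)
target = doc_to_target(doc)
generation = truncate_at_stop(" b. 2")
score = exact_match(generation, target)
```

With these settings a generation such as " b. 2" is cut at the first "." and then compared case-insensitively with spaces removed, so it would count as a match for target "B" under this approximation.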
lm_eval/tasks/kmmlu/direct/kmmlu_direct_accounting.yaml
0 → 100644
dataset_name: Accounting
include: _direct_kmmlu_yaml
task: kmmlu_direct_accounting
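
Each per-subject file carries only three keys and pulls everything else from the base file through `include`. The sketch below shows one way to picture that, assuming `include` acts as a shallow merge in which the task file's own keys take precedence. The helper `resolve_include` and the in-memory `base_cfgs` mapping are hypothetical, and the harness's actual loader may behave differently.

```python
def resolve_include(task_cfg, base_cfgs):
    """Merge a task config over the base config named by its `include` key."""
    cfg = dict(task_cfg)
    base_name = cfg.pop("include", None)
    base = base_cfgs.get(base_name, {}) if base_name else {}
    # Keys in the task file override keys inherited from the base file.
    return {**base, **cfg}


# A few keys copied from the base file, held in memory for illustration.
base_cfgs = {
    "_direct_kmmlu_yaml": {
        "dataset_path": "HAERAE-HUB/KMMLU",
        "output_type": "generate_until",
        "test_split": "test",
        "fewshot_split": "dev",
    }
}
task_cfg = {
    "dataset_name": "Accounting",
    "include": "_direct_kmmlu_yaml",
    "task": "kmmlu_direct_accounting",
}
merged = resolve_include(task_cfg, base_cfgs)
```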
lm_eval/tasks/kmmlu/direct/kmmlu_direct_agricultural_sciences.yaml
0 → 100644
dataset_name: Agricultural-Sciences
include: _direct_kmmlu_yaml
task: kmmlu_direct_agricultural_sciences
lm_eval/tasks/kmmlu/direct/kmmlu_direct_aviation_engineering_and_maintenance.yaml
0 → 100644
dataset_name: Aviation-Engineering-and-Maintenance
include: _direct_kmmlu_yaml
task: kmmlu_direct_aviation_engineering_and_maintenance
lm_eval/tasks/kmmlu/direct/kmmlu_direct_biology.yaml
0 → 100644
dataset_name: Biology
include: _direct_kmmlu_yaml
task: kmmlu_direct_biology
lm_eval/tasks/kmmlu/direct/kmmlu_direct_chemical_engineering.yaml
0 → 100644
dataset_name: Chemical-Engineering
include: _direct_kmmlu_yaml
task: kmmlu_direct_chemical_engineering
lm_eval/tasks/kmmlu/direct/kmmlu_direct_chemistry.yaml
0 → 100644
dataset_name: Chemistry
include: _direct_kmmlu_yaml
task: kmmlu_direct_chemistry
lm_eval/tasks/kmmlu/direct/kmmlu_direct_civil_engineering.yaml
0 → 100644
dataset_name: Civil-Engineering
include: _direct_kmmlu_yaml
task: kmmlu_direct_civil_engineering
lm_eval/tasks/kmmlu/direct/kmmlu_direct_computer_science.yaml
0 → 100644
dataset_name: Computer-Science
include: _direct_kmmlu_yaml
task: kmmlu_direct_computer_science
lm_eval/tasks/kmmlu/direct/kmmlu_direct_construction.yaml
0 → 100644
dataset_name: Construction
include: _direct_kmmlu_yaml
task: kmmlu_direct_construction
lm_eval/tasks/kmmlu/direct/kmmlu_direct_criminal_law.yaml
0 → 100644
dataset_name: Criminal-Law
include: _direct_kmmlu_yaml
task: kmmlu_direct_criminal_law
lm_eval/tasks/kmmlu/direct/kmmlu_direct_ecology.yaml
0 → 100644
dataset_name: Ecology
include: _direct_kmmlu_yaml
task: kmmlu_direct_ecology
lm_eval/tasks/kmmlu/direct/kmmlu_direct_economics.yaml
0 → 100644
dataset_name: Economics
include: _direct_kmmlu_yaml
task: kmmlu_direct_economics
lm_eval/tasks/kmmlu/direct/kmmlu_direct_education.yaml
0 → 100644
dataset_name: Education
include: _direct_kmmlu_yaml
task: kmmlu_direct_education
lm_eval/tasks/kmmlu/direct/kmmlu_direct_electrical_engineering.yaml
0 → 100644
dataset_name: Electrical-Engineering
include: _direct_kmmlu_yaml
task: kmmlu_direct_electrical_engineering
lm_eval/tasks/kmmlu/direct/kmmlu_direct_electronics_engineering.yaml
0 → 100644
dataset_name: Electronics-Engineering
include: _direct_kmmlu_yaml
task: kmmlu_direct_electronics_engineering
lm_eval/tasks/kmmlu/direct/kmmlu_direct_energy_management.yaml
0 → 100644
dataset_name: Energy-Management
include: _direct_kmmlu_yaml
task: kmmlu_direct_energy_management
lm_eval/tasks/kmmlu/direct/kmmlu_direct_environmental_science.yaml
0 → 100644
dataset_name: Environmental-Science
include: _direct_kmmlu_yaml
task: kmmlu_direct_environmental_science
lm_eval/tasks/kmmlu/direct/kmmlu_direct_fashion.yaml
0 → 100644
dataset_name: Fashion
include: _direct_kmmlu_yaml
task: kmmlu_direct_fashion
lm_eval/tasks/kmmlu/direct/kmmlu_direct_food_processing.yaml
0 → 100644
dataset_name: Food-Processing
include: _direct_kmmlu_yaml
task: kmmlu_direct_food_processing
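
The per-subject files above follow one naming pattern: the task name is `kmmlu_direct_` plus the dataset name lowercased with hyphens turned into underscores. The sketch below reproduces that pattern for a few of the subjects shown. The helpers `task_name` and `task_yaml` are hypothetical and only illustrate the convention; the commit does not say how the files were actually produced.

```python
def task_name(dataset_name, prefix="kmmlu_direct"):
    """Derive the task name from a dataset subset name, e.g. Criminal-Law."""
    slug = dataset_name.lower().replace("-", "_")
    return f"{prefix}_{slug}"


def task_yaml(dataset_name, base="_direct_kmmlu_yaml"):
    """Render the three-line per-subject config as YAML text."""
    return (
        f"dataset_name: {dataset_name}\n"
        f"include: {base}\n"
        f"task: {task_name(dataset_name)}\n"
    )


# Subset names taken from the files in this commit.
subjects = ["Accounting", "Agricultural-Sciences", "Aviation-Engineering-and-Maintenance"]
files = {f"{task_name(s)}.yaml": task_yaml(s) for s in subjects}
```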