Commit c26a6ac7, authored by Hanwool Albert Lee, committed by GitHub

Added KMMLU evaluation method and changed ReadMe (#1447)



* update kmmlu default formatting

* Update _default_kmmlu_yaml

* Delete lm_eval/tasks/kmmlu/utils.py

* new tasks implemented

* add direct tasks

* update direct evaluate

* update direct eval

* add cot sample

* update cot

* add cot

* Update _cot_kmmlu_yaml

* add kmmlu90

* Update and rename _cot_kmmlu.yaml to _cot_kmmlu_yaml

* Create kmmlu90.yaml

* Update _cot_kmmlu_yaml

* add direct

* Update _cot_kmmlu_yaml

* Update and rename kmmlu90.yaml to kmmlu90_cot.yaml

* Update kmmlu90_direct.yaml

* add kmmlu hard

* Update _cot_kmmlu_yaml

* Update _cot_kmmlu_yaml

* update cot

* update cot

* erase typo

* Update _cot_kmmlu_yaml

* update cot

* Rename dataset to match k-mmlu-hard

* removed kmmlu90

* fixed name 'kmmlu_cot' to 'kmmlu_hard_cot' and revised README

* applied pre-commit before pull requests

* rename datasets and add notes

* Remove DS_Store cache

* Update lm_eval/tasks/kmmlu/README.md
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* Change citations and reflect reviews on version

* Added kmmlu_hard and fixed other errors

* fixing minor errors

* remove duplicated

* Rename files

* try ".index"

* minor fix

* minor fix again

* fix revert.

* minor fix. thanks to Hailey

---------
Co-authored-by: GUIJIN SON <spthsrbwls123@yonsei.ac.kr>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
parent 5ab295c8
"dataset_name": "Energy-Management"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_energy_management"
"dataset_name": "Environmental-Science"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_environmental_science"
"dataset_name": "Fashion"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_fashion"
"dataset_name": "Food-Processing"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_food_processing"
"dataset_name": "Gas-Technology-and-Engineering"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_gas_technology_and_engineering"
"dataset_name": "Geomatics"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_geomatics"
"dataset_name": "Health"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_health"
"dataset_name": "Industrial-Engineer"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_industrial_engineer"
"dataset_name": "Information-Technology"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_information_technology"
"dataset_name": "Interior-Architecture-and-Design"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_interior_architecture_and_design"
"dataset_name": "Law"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_law"
"dataset_name": "Machine-Design-and-Manufacturing"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_machine_design_and_manufacturing"
"dataset_name": "Management"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_management"
"dataset_name": "Maritime-Engineering"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_maritime_engineering"
"dataset_name": "Marketing"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_marketing"
"dataset_name": "Materials-Engineering"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_materials_engineering"
"dataset_name": "Mechanical-Engineering"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_mechanical_engineering"
"dataset_name": "Nondestructive-Testing"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_nondestructive_testing"
"dataset_name": "Patent"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_patent"
"dataset_name": "Political-Science-and-Sociology"
"include": "_default_kmmlu_yaml"
"task": "kmmlu_political_science_and_sociology"
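
The per-subject configs above share a visible pattern: each one names a `dataset_name`, includes `_default_kmmlu_yaml`, and sets a `task` that is the dataset name lowercased, with hyphens replaced by underscores and a `kmmlu_` prefix. The commit does not show how these files were produced, so the following is only a minimal sketch of that naming rule; the helper names `task_config` and `render` are hypothetical and are not part of lm_eval.

```python
import json


def task_config(dataset_name: str, include: str = "_default_kmmlu_yaml") -> dict:
    """Build one per-subject config following the pattern seen in the diff.

    Hypothetical helper: the task name is derived by lowercasing the
    dataset name, replacing hyphens with underscores, and adding "kmmlu_".
    """
    task = "kmmlu_" + dataset_name.lower().replace("-", "_")
    return {"dataset_name": dataset_name, "include": include, "task": task}


def render(config: dict) -> str:
    """Render the config as double-quoted key/value lines.

    JSON-quoted strings are also valid YAML scalars, so the output
    matches the quoting style used in the files above.
    """
    lines = [f"{json.dumps(key)}: {json.dumps(value)}" for key, value in config.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two subjects taken from the diff above.
    for name in ["Energy-Management", "Gas-Technology-and-Engineering"]:
        print(render(task_config(name)))
```

Deriving the task name from the dataset name in one place keeps the two fields from drifting apart when subjects are added or renamed.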