".github/vscode:/vscode.git/clone" did not exist on "2387c22b5614288987ae35aef4fe344e852be77f"
Unverified Commit c26a6ac7 authored by Hanwool Albert Lee's avatar Hanwool Albert Lee Committed by GitHub
Browse files

Added KMMLU evaluation method and changed ReadMe (#1447)



* update kmmlu default formatting

* Update _default_kmmlu_yaml

* Delete lm_eval/tasks/kmmlu/utils.py

* new tasks implemented

* add direct tasks

* update direct evaluate

* update direct eval

* add cot sample

* update cot

* add cot

* Update _cot_kmmlu_yaml

* add kmmlu90

* Update and rename _cot_kmmlu.yaml to _cot_kmmlu_yaml

* Create kmmlu90.yaml

* Update _cot_kmmlu_yaml

* add direct

* Update _cot_kmmlu_yaml

* Update and rename kmmlu90.yaml to kmmlu90_cot.yaml

* Update kmmlu90_direct.yaml

* add kmmlu hard

* Update _cot_kmmlu_yaml

* Update _cot_kmmlu_yaml

* update cot

* update cot

* erase typo

* Update _cot_kmmlu_yaml

* update cot

* Rename dataset to match k-mmlu-hard

* removed kmmlu90

* fixed name 'kmmlu_cot' to 'kmmlu_hard_cot' and revised README

* applied pre-commit before pull requests

* rename datasets and add notes

* Remove DS_Store cache

* Update lm_eval/tasks/kmmlu/README.md
Co-authored-by: default avatarHailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* Change citations and reflect reviews on version

* Added kmmlu_hard and fixed other errors

* fixing minor errors

* remove duplicated

* Rename files

* try ".index"

* minor fix

* minor fix again

* fix revert.

* minor fix. thank for hailey

---------
Co-authored-by: default avatarGUIJIN SON <spthsrbwls123@yonsei.ac.kr>
Co-authored-by: default avatarHailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
parent 5ab295c8
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment