README.md 2.4 KB
Newer Older
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
# KorMedMCQA

### Paper

Title: `KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations`

Abstract: `We introduce KorMedMCQA, the first Korean multiple-choice question answering (MCQA) benchmark derived from Korean healthcare professional licensing examinations, covering from the year 2012 to year 2023. This dataset consists of a selection of questions from the license examinations for doctors, nurses, and pharmacists, featuring a diverse array of subjects. We conduct baseline experiments on various large language models, including proprietary/open-source, multilingual/Korean-additional pretrained, and clinical context pretrained models, highlighting the potential for further enhancements. We make our data publicly available on HuggingFace and provide a evaluation script via LM-Harness, inviting further exploration and advancement in Korean healthcare environments.`


Paper : https://arxiv.org/abs/2403.01469

Homepage: https://huggingface.co/datasets/sean0042/KorMedMCQA


### Citation

```
@article{kweon2024kormedmcqa,
      title={KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations},
      author={Sunjun Kweon and Byungjin Choi and Minkyu Kim and Rae Woong Park and Edward Choi},
      journal={arXiv preprint arXiv:2403.01469},
      year={2024}
}
```

### Groups and Tasks

Gyouk Chu's avatar
Gyouk Chu committed
28
* `kormedmcqa`: Runs `kormedmcqa_doctor`, `kormedmcqa_nurse`, `kormedmcqa_pharm`, and `kormedmcqa_dentist`.
29
30
31
32
33
34

#### Tasks

* `kormedmcqa_doctor`: `Official Korean Doctor Examination`
* `kormedmcqa_nurse`: `Official Korean Nurse Examination`
* `kormedmcqa_pharm`: `Official Korean Pharmacist Examination`
Gyouk Chu's avatar
Gyouk Chu committed
35
* `kormedmcqa_dentist`: `Official Korean Dentist Examination`
36
37
38
39

### Checklist

For adding novel benchmarks/datasets to the library:
Gyouk Chu's avatar
Gyouk Chu committed
40
41
42
* [ ] Is the task an existing benchmark in the literature?
  * [ ] Have you referenced the original paper that introduced the task?
  * [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?
43
44
45
46
47
48


If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?