# CareQA

### Paper

Title: `Automatic Evaluation of Healthcare LLMs Beyond Question-Answering`

Abstract: [https://arxiv.org/abs/2502.06666](https://arxiv.org/abs/2502.06666)

CareQA originates from the Spanish Specialised Healthcare Training (MIR) exams by the
Spanish Ministry of Health. The close-ended version is a multiple-choice question
answering (MCQA) dataset of 5,621 QA pairs across six categories (medicine, nursing,
biology, chemistry, psychology, and pharmacology), sourced from the 2020 to 2024 exam
editions. It is available in both English and Spanish. The open-ended version
(English only) contains 3,730 QA pairs.

Homepage: \
[https://huggingface.co/datasets/HPAI-BSC/CareQA](https://huggingface.co/datasets/HPAI-BSC/CareQA)


#### Tasks

* `careqa_en`: MCQA in English.
* `careqa_es`: MCQA in Spanish.
* `careqa_open`: Open-ended QA in English.
* `careqa_open_perplexity`: Open-ended QA in English, evaluated with perplexity.
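
The tasks above are scored in two different ways: the MCQA tasks compare a chosen option against the gold answer, while the perplexity task measures how likely the model finds the reference answer. The sketch below illustrates both ideas in plain Python. It is not the code these tasks run: the function names (`pick_option`, `mcqa_accuracy`, `perplexity`) are hypothetical, the log-probabilities are made up, and it assumes that options are ranked by log-likelihood and that perplexity is computed per token from natural-log probabilities.

```python
import math
from typing import Sequence


def pick_option(option_logprobs: Sequence[float]) -> int:
    """Return the index of the answer option with the highest log-likelihood."""
    return max(range(len(option_logprobs)), key=lambda i: option_logprobs[i])


def mcqa_accuracy(
    all_option_logprobs: Sequence[Sequence[float]],
    gold_indices: Sequence[int],
) -> float:
    """Fraction of questions where the top-scoring option is the gold answer."""
    correct = sum(
        pick_option(logprobs) == gold
        for logprobs, gold in zip(all_option_logprobs, gold_indices)
    )
    return correct / len(gold_indices)


def perplexity(token_logprobs: Sequence[float]) -> float:
    """exp of the negative mean natural-log probability of the reference tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Made-up scores for two questions with four options each (assumption).
scores = [
    [-4.1, -2.3, -5.0, -3.7],  # model prefers option 1
    [-1.2, -3.3, -2.8, -4.0],  # model prefers option 0
]
gold = [1, 2]  # hypothetical gold option indices

print("accuracy:", mcqa_accuracy(scores, gold))

# A reference answer whose four tokens each have probability 0.5
# gives a perplexity close to 2.
print("perplexity:", perplexity([math.log(0.5)] * 4))
```

Lower perplexity means the model assigns higher probability to the reference answer, so unlike accuracy it is a lower-is-better metric.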

### Citation

```bibtex
@misc{ariasduart2025automaticevaluationhealthcarellms,
      title={Automatic Evaluation of Healthcare LLMs Beyond Question-Answering},
      author={Anna Arias-Duart and Pablo Agustin Martin-Torres and Daniel Hinjos and Pablo Bernabeu-Perez and Lucia Urcelay Ganzabal and Marta Gonzalez Mallo and Ashwin Kumar Gururajan and Enrique Lopez-Cuena and Sergio Alvarez-Napagao and Dario Garcia-Gasulla},
      year={2025},
      eprint={2502.06666},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.06666},
}
```