# CareQA

### Paper

Title: `Automatic Evaluation of Healthcare LLMs Beyond Question-Answering`

Abstract: [https://arxiv.org/abs/2502.06666](https://arxiv.org/abs/2502.06666)

CareQA originates from the Spanish Specialised Healthcare Training (MIR) exams by the Spanish Ministry of Health. The close-ended version is a multiple-choice question answering (MCQA) dataset of 5,621 QA pairs across six categories: medicine, nursing, biology, chemistry, psychology, and pharmacology, sourced from the 2020 to 2024 exam editions. The close-ended version is available in both English and Spanish. The open-ended version (English only) contains 3,730 QA pairs.

Homepage: [https://huggingface.co/datasets/HPAI-BSC/CareQA](https://huggingface.co/datasets/HPAI-BSC/CareQA)

#### Tasks

* `careqa_en`: MCQA in English.
* `careqa_es`: MCQA in Spanish.
* `careqa_open`: Open-ended QA in English.
* `careqa_open_perplexity`: Open-ended QA in English, evaluated with perplexity.

### Citation

```bibtex
@misc{ariasduart2025automaticevaluationhealthcarellms,
      title={Automatic Evaluation of Healthcare LLMs Beyond Question-Answering},
      author={Anna Arias-Duart and Pablo Agustin Martin-Torres and Daniel Hinjos and Pablo Bernabeu-Perez and Lucia Urcelay Ganzabal and Marta Gonzalez Mallo and Ashwin Kumar Gururajan and Enrique Lopez-Cuena and Sergio Alvarez-Napagao and Dario Garcia-Gasulla},
      year={2025},
      eprint={2502.06666},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.06666},
}
```
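
The task names listed under Tasks are intended to be passed to an evaluation runner. The following is a minimal sketch, assuming this README accompanies the lm-evaluation-harness task definitions and that its `lm_eval` command-line tool is installed. The helper `build_eval_command`, the `hf` model backend, the batch size, and the placeholder model name are illustrative assumptions, not part of CareQA itself.

```python
import shlex

# Task names exactly as listed in the Tasks section of this README.
CAREQA_TASKS = (
    "careqa_en",
    "careqa_es",
    "careqa_open",
    "careqa_open_perplexity",
)


def build_eval_command(model_name, tasks=CAREQA_TASKS, batch_size=8):
    """Build an `lm_eval` argument list for the given CareQA tasks.

    Hypothetical helper: it assumes the lm-evaluation-harness CLI and its
    Hugging Face (`hf`) model backend are available.
    """
    unknown = [t for t in tasks if t not in CAREQA_TASKS]
    if unknown:
        raise ValueError(f"Unknown CareQA task(s): {unknown}")
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_name}",
        "--tasks", ",".join(tasks),
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    # "your-org/your-model" is a placeholder, not a real checkpoint.
    cmd = build_eval_command("your-org/your-model", ["careqa_en", "careqa_es"])
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be handed to `subprocess.run`. Validating task names up front catches typos before a model is loaded.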