AI versus human-generated multiple-choice questions for medical education: a cohort study in a high-stakes examination
Chinese University of Hong Kong · Hong Kong College of Technology · +2 more institutions
Indexed in: Crossref, DOAJ, PubMed
Abstract
Background
The creation of high-quality multiple-choice questions (MCQs) is essential for medical education assessments but is resource-intensive and time-consuming when done by human experts. Large language models (LLMs) like ChatGPT-4o offer a promising alternative, but their efficacy remains unclear, particularly in high-stakes exams.
Objective
This study aimed to evaluate the quality and psychometric properties of ChatGPT-4o-generated MCQs compared to human-created MCQs in a high-stakes medical licensing exam.
Citation impact
- Total citations: 60
- FWCI: 28.72
- Percentile: 100%
- References: 27
Authors: 7
Topics & keywords
Keywords
- Multiple choice
- Medical education
- Educational measurement
- Cohort
- Medicine
- MEDLINE
- Psychology
- Curriculum