Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments

Brin, Dana; Sorin, Vera; Vaid, Akhil; Soroush, Ali; Glicksberg, Benjamin S.; Charney, Alexander W.; Nadkarni, Girish N.; Klang, Eyal

doi:10.1038/s41598-023-43436-9

Article · Scientific Reports · Oct 1, 2023 · Gold OA

Tel Aviv University · Sheba Medical Center · +1 more institution

Indexed in: Crossref, DOAJ

Abstract

The United States Medical Licensing Examination (USMLE) has been a subject of performance study for artificial intelligence (AI) models. However, their performance on questions involving USMLE soft skills remains unexplored. This study aimed to evaluate ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style questions involving soft skills, taken from the USMLE website and the AMBOSS question bank. A follow-up query was used to assess the models’ consistency. The performance of the AI models was compared to that of previous AMBOSS users. GPT-4 outperformed ChatGPT, correctly answering 90% compared to ChatGPT’s 62.5%. GPT-4…

Citation impact

Total citations: 318
FWCI: 11.49
Percentile: 100%
References: 14

Authors: 8

Topics & keywords

Keywords

Empathy
Consistency (knowledge bases)
Soft skills
Interpersonal communication
United States Medical Licensing Examination
Medical education
Psychology
Computer science

UN Sustainable Development Goals

Quality Education