GPT versus Resident Physicians — A Benchmark Based on Official Board Scores

Katz, Uriel; Cohen, Eran; Shachar, Eliya; Somer, Jonathan; Fink, Adam Benjamin; Morse, E V; Shreiber, Beki; Wolf, Ido

doi:10.1056/aidbp2300192

Article · NEJM AI · Apr 12, 2024 · Bronze OA

Tel Aviv University · Tel Aviv Sourasky Medical Center · +6 more institutions

Indexed in Crossref

Abstract

BACKGROUND Artificial intelligence (AI) is a burgeoning technological advancement, with considerable promise for influencing the field of medicine. As a preliminary step toward integrating AI into medical practice, it is imperative to ascertain whether model performance is comparable with that of physicians. We present a systematic comparison of performance by a large language model (LLM) versus that of a large cohort of physicians. The cohort includes all residents who took the medical specialist license examination in Israel in 2022 across the core medical disciplines: internal medicine, general surgery, pediatrics, psychiatry, and obstetrics and gynecology (OB/GYN). We provide the examinations as an…

Citation impact

Total citations: 127

FWCI: 13.64
Percentile: 100%
References: 18

Authors: 8

Topics & keywords

Keywords

Benchmark (surveying)
Medicine
Psychology
Family medicine
Computer science
Medical education
Geography
Cartography

UN Sustainable Development Goals

Quality Education