Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis

Liu, Mingxin; Okuhara, Tsuyoshi; Chang, Xinyi; Shirabe, Ritsuko; Nishiie, Yuriko; Okada, Hiroko; Kiuchi, Takahiro

doi:10.2196/60807

Review · Journal of Medical Internet Research · Jun 15, 2024 · Gold OA

University of Tokyo Health Sciences · Tokyo Institute of Technology

Indexed in: Crossref, DOAJ, PubMed

Abstract

Background

Over the past 2 years, researchers have used various medical licensing examinations to test whether ChatGPT (OpenAI) possesses accurate medical knowledge. Performance has differed markedly across ChatGPT versions and across examination settings. A comprehensive understanding of this variability in ChatGPT's performance on different medical licensing examinations is still lacking.

Objective

We reviewed all studies on ChatGPT's performance in medical licensing examinations up to March 2024. This review aims to contribute to the evolving discourse on artificial intelligence (AI) in medical education by providing a comprehensive analysis of ChatGPT's performance in various environments. The insights gained from this systematic review will guide educators, policymakers, and technical experts in using AI effectively and judiciously in medical education.

Citation impact

Total citations: 185

FWCI: 19.69
Percentile: 100%
References: 68

Authors: 7

Keywords

Preprint
Meta-analysis
MEDLINE
Peer review
Data science
Computer science
Medicine
World Wide Web
