Large Language Model Influence on Diagnostic Reasoning

Goh, Ethan; Gallo, Robert; Hom, Jason; Strong, Eric; Weng, Yingjie; Kerman, Hannah; Cool, Joséphine A.; Kanjee, Zahir; Parsons, Andrew S.; Ahuja, Neera; Horvitz, Eric; Yang, Daniel X.; Milstein, Arnold; Olson, Andrew; Rodman, Adam; Chen, Jonathan H.

doi:10.1001/jamanetworkopen.2024.40969

Article · JAMA Network Open · Oct 28, 2024 · Gold OA

Stanford University · VA Palo Alto Health Care System · +8 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Importance

Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves physician diagnostic reasoning.

Objective

To assess the effect of an LLM on physicians' diagnostic reasoning compared with conventional resources.

Design, Setting, and Participants

A single-blind randomized clinical trial was conducted from November 29 to December 29, 2023. Using remote video conferencing and in-person participation across multiple academic medical institutions, physicians with training in family medicine, internal medicine, or emergency medicine were recruited.

Intervention

Participants were randomized to either access the LLM in addition to conventional diagnostic resources or conventional resources only, stratified by career stage. Participants were allocated 60 minutes to review up to 6 clinical vignettes.

Main Outcomes and Measures

The primary outcome was performance on a standardized rubric of diagnostic performance based on differential diagnosis accuracy, appropriateness of supporting and opposing factors, and next diagnostic evaluation steps, validated and graded via blinded expert consensus. Secondary outcomes included time spent per case (in seconds) and final diagnosis accuracy. All analyses followed the intention-to-treat principle. A secondary exploratory analysis evaluated the standalone performance of the LLM by comparing the primary outcomes between the LLM alone group and the conventional resource group.

Citation impact

Total citations: 505
FWCI: 220.69
Percentile: 100%
References: 41

Authors: 16

Keywords

Rubric
Medicine
Randomized controlled trial
MEDLINE
Intervention (counseling)
Physical therapy
Family medicine
Medical physics
