ChatGPT Health performance in a structured test of triage recommendations

Ramaswamy, Ashwin; Tyagi, Alvira; Hugo, Hannah; Jiang, Joy; Jayaraman, Pushkala; Jangda, Mateen; Te, Alexis E.; Kaplan, Steven A.; Lampert, Joshua; Freeman, Robert; Gavin, Nicholas; Tewari, Ashutosh; Sakhuja, Ankit; Naved, Bilal; Charney, Alexander W.; Omar, Mahmud; Gorin, Michael A.; Klang, Eyal; Nadkarni, Girish N.

doi:10.1038/s41591-026-04297-7

Article · Nature Medicine · Feb 23, 2026 · Hybrid OA

Mount Sinai Health System · University of Miami

Indexed in: Crossref, PubMed

Abstract

ChatGPT Health was launched in January 2026 as OpenAI's consumer health tool and has reached millions of users. Here we conducted a structured stress test of triage recommendations using 60 clinician-authored vignettes across 21 clinical domains under 16 factorial conditions, yielding 960 total responses. Performance followed an inverted U-shaped pattern, with the most dangerous failures concentrated at clinical extremes: nonurgent presentations (35%) and emergency conditions (48%). Among gold-standard emergencies, the system undertriaged 52% of cases, directing patients with diabetic ketoacidosis or impending respiratory failure to 24-48 h evaluation rather than the emergency department, while correctly…

Citation impact

Total citations: 17

FWCI: 142.74
Percentile: 100%
References: 16

Authors: 19

Topics & keywords

Keywords

Triage
Test (biology)
Emergency department
Health care
Intervention (counseling)
Prospective cohort study
Emergency medical services

UN Sustainable Development Goals

Zero hunger
