Is Artificial Intelligence Ready for Emergency Department Triage? A Retrospective Evaluation of Multiple Large Language Models in 39,375 Patients at a University Emergency Department

Nedos, Ioannis; Zagalioti, Sofia-Chrysovalantou; Kofos, Christos; Katsikidou, Theoni; Vellidou, Dimitra; Astrinakis, Konstantinos; Karagiannis, Ioannis; Giannakopoulos, Panagiotis; Michaloudi, Styliani; Apostolopoulou, Aikaterini; Karagiannidis, Efstratios; Fyntanidou, Barbara

doi:10.3390/jcm15041512

Article · Journal of Clinical Medicine · Feb 14, 2026 · Gold OA

AHEPA University Hospital

Indexed in: Crossref, PubMed

Abstract

Background

Large language models (LLMs) are increasingly proposed as clinical decision support tools. However, their reliability in emergency department (ED) triage remains insufficiently validated. This study aimed to evaluate the performance and limitations of multiple LLMs in triage using a large retrospective dataset.

Methods

We conducted a retrospective analysis of 39,375 anonymized patient cases from the ED of AHEPA University General Hospital, Thessaloniki, Greece (June 2024–July 2025), extracted from the hospital’s electronic medical record system. All cases were triaged in real time according to the Emergency Severity Index (ESI) by 25 emergency physicians. In cases of uncertainty, a senior emergency physician was consulted. Seven LLMs (ChatGPT-5 Thinking, ChatGPT-5 Instant, Gemini 2.5, Qwen 3, Grok 4.0, DeepSeek v3.1, and Claude Sonnet 4) were evaluated against the physician-assigned ESI level (reference standard). Outcomes included triage score agreement (quadratic weighted kappa, κw), clinic referral accuracy, and admission prediction. Subgroup analyses were performed by referral clinic and admission outcome. The study was conducted in accordance with TRIPOD-AI reporting guidelines.
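
The agreement metric named in the Methods, quadratic weighted kappa, can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name, argument names, and the default level set are assumptions made for illustration. For k ordered levels, a disagreement between levels i and j is weighted by w_ij = (i − j)² / (k − 1)², and κw = 1 − Σ w_ij·O_ij / Σ w_ij·E_ij, where O is the observed count table and E is the table expected from the marginals if the two raters were independent.

```python
def quadratic_weighted_kappa(reference, predicted, levels=(1, 2, 3, 4, 5)):
    """Quadratic weighted kappa between two lists of ordinal ratings.

    `reference` and `predicted` are hypothetical names: e.g. physician-assigned
    ESI levels and the levels assigned by one LLM for the same cases.
    """
    if len(reference) != len(predicted):
        raise ValueError("both rating lists must have the same length")
    n = len(reference)
    if n == 0:
        raise ValueError("at least one rated case is required")

    k = len(levels)
    index = {level: i for i, level in enumerate(levels)}

    # Observed k x k count table: rows = reference, columns = predicted.
    observed = [[0] * k for _ in range(k)]
    for ref, pred in zip(reference, predicted):
        observed[index[ref]][index[pred]] += 1

    # Marginal totals used to build the chance-expected table.
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic penalty: 0 on the diagonal, 1 for the largest gap.
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        # Every rating from both raters falls in one level: kappa is undefined.
        raise ValueError("kappa is undefined when all ratings share one level")
    return 1.0 - weighted_observed / weighted_expected
```

With this definition, identical rating lists give κw = 1, chance-level agreement gives values near 0, and because the penalty grows with the square of the distance, a one-level ESI discrepancy costs far less than a four-level one.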

Citation impact

Total citations: 4

FWCI: 34.89
Percentile: 99%
References: 0

Authors

12

Ioannis Nedos, AHEPA University Hospital
Sofia-Chrysovalantou Zagalioti, AHEPA University Hospital
Christos Kofos, AHEPA University Hospital
Theoni Katsikidou, AHEPA University Hospital
Dimitra Vellidou, AHEPA University Hospital

Topics & keywords

Keywords

Triage
Emergency department
Referral
Retrospective cohort study
Sonnet
Medical record

UN Sustainable Development Goals

Peace, Justice and Strong Institutions