Retrieval augmented generation for 10 large language models and its generalizability in assessing medical fitness

Ke, Yu He; Jin, Liyuan; Elangovan, Kabilan; Abdullah, Hairil Rizal; Liu, Nan; Sia, Alex Tiong Heng; Soh, Chai Rick; Tung, Joshua Yi Min; Ong, Jasmine Chiat Ling; Kuo, Chang‐Fu; Wu, Shaochun; Kovacheva, Vesela; Ting, Daniel Shu Wei

doi:10.1038/s41746-025-01519-z

Article · npj Digital Medicine · Apr 5, 2025 · Gold OA

Singapore General Hospital · Singapore National Eye Center · +10 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Large Language Models (LLMs) hold promise for medical applications but often lack domain-specific expertise. Retrieval Augmented Generation (RAG) enables customization by integrating specialized knowledge. This study assessed the accuracy, consistency, and safety of LLM-RAG models in determining surgical fitness and delivering preoperative instructions using 35 local and 23 international guidelines. Ten LLMs (e.g., GPT3.5, GPT4, GPT4o, Gemini, Llama2, Llama3, and Claude) were tested across 14 clinical scenarios. A total of 3234 responses were generated and compared to 448 human-generated answers. The GPT4 LLM-RAG model with international guidelines generated answers within 20 s and achieved the highest…

Citation impact

Total citations: 89

FWCI: 42.37
Percentile: 100%
References: 20

Authors: 13

Topics & keywords

Keywords

Generalizability theory
Computer science
Natural language processing
Psychology
Artificial intelligence
Information retrieval
Developmental psychology

Funding

Singapore General Hospital