Clinical entity augmented retrieval for clinical information extraction

López, Iván; Swaminathan, Akshay; Vedula, Karthik S.; Narayanan, Sanjana; Haredasht, Fateme Nateghi; Stephen, P.; Liang, April S.; Tate, Steven; Maddali, Manoj; Gallo, Robert J.; Shah, Nigam H.; Chen, Jonathan H.

doi:10.1038/s41746-024-01377-1

Article · npj Digital Medicine · Jan 19, 2025 · Gold OA

Stanford Medicine · Stanford University · +3 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Large language models (LLMs) with retrieval-augmented generation (RAG) have improved information extraction over previous methods, yet their reliance on embeddings often leads to inefficient retrieval. We introduce CLinical Entity Augmented Retrieval (CLEAR), a RAG pipeline that retrieves information using entities. We compared CLEAR to embedding RAG and full-note approaches for extracting 18 variables using six LLMs across 20,000 clinical notes. Average F1 scores were 0.90, 0.86, and 0.79; inference times were 4.95, 17.41, and 20.08 s per note; average model queries were 1.68, 4.94, and 4.18 per note; and average input tokens were 1.1k, 3.8k, and 6.1k per note for CLEAR, embedding RAG, and full-note…
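
The abstract says only that CLEAR retrieves information using entities; it does not say how entities are identified or how much surrounding text is kept. As a rough illustration of the general idea, and not the authors' implementation, the sketch below keeps only the sentences of a note that mention a target entity, plus a configurable number of neighbouring sentences. The function name `retrieve_entity_windows`, the `window` parameter, the regex-based sentence splitting and term matching, and the synthetic note are all assumptions made for this example.

```python
import re


def retrieve_entity_windows(note, entity_terms, window=1):
    """Return sentences mentioning any entity term, plus `window` neighbours per side.

    Hypothetical helper for illustration; not taken from the CLEAR paper.
    """
    if not entity_terms:
        return []
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]
    # Whole-word, case-insensitive match for any of the entity terms.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in entity_terms) + r")\b",
        re.IGNORECASE,
    )
    keep = set()
    for i, sentence in enumerate(sentences):
        if pattern.search(sentence):
            lo = max(0, i - window)
            hi = min(len(sentences), i + window + 1)
            keep.update(range(lo, hi))
    # Preserve the original sentence order in the retrieved context.
    return [sentences[i] for i in sorted(keep)]


# Synthetic note written for this example; not real patient data.
note = (
    "Patient seen for follow-up. Reports improved sleep. "
    "Denies chest pain or dyspnea. Family history is unremarkable. "
    "Continues metformin for type 2 diabetes. Will return in three months."
)

narrow = retrieve_entity_windows(note, ["diabetes", "metformin"], window=0)
wide = retrieve_entity_windows(note, ["diabetes", "metformin"], window=1)
```

Only the retrieved sentences would then be passed to the model instead of the full note, which is one plausible way an entity-based retriever could reduce input tokens relative to sending the whole note.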

Citation impact

Total citations: 44

FWCI: 27.11
Percentile: 100%
References: 69

Authors: 12

Topics & keywords

Keywords

Pipeline (software)
Computer science
Inference
Embedding
Security token
Information retrieval
Information extraction
Artificial intelligence

UN Sustainable Development Goals

Quality Education
