Evaluating large language models as agents in the clinic

Mehandru, Nikita; Miao, Brenda Y.; Almaraz, Eduardo Rodriguez; Sushil, Madhumita; Butte, Atul J.; Alaa, Ahmed M.

doi:10.1038/s41746-024-01083-y

Article · npj Digital Medicine · Apr 3, 2024 · Gold OA

Hearst (United States) · University of California, Berkeley · +1 more institution

Indexed in: Crossref, DOAJ, PubMed

Abstract

Recent developments in large language models (LLMs) have unlocked opportunities for healthcare, from information synthesis to clinical decision support. These LLMs are not just capable of modeling language, but can also act as intelligent “agents” that interact with stakeholders in open-ended conversations and even influence clinical decision-making. Rather than relying on benchmarks that measure a model’s ability to process clinical data or answer standardized test questions, LLM agents can be modeled in high-fidelity simulations of clinical settings and should be assessed for their impact on clinical workflows. These evaluation frameworks, which we refer to as “Artificial Intelligence Structured Clinical…

Citation impact

Total citations: 117
FWCI: 12.94
Percentile: 100%
References: 28
Authors: 6

Topics & keywords

Keywords

Workflow
Process (computing)
Computer science
Fidelity
Corporate governance
Clinical decision support system
Knowledge management
Data science

UN Sustainable Development Goals

Quality Education
