A systematic review of large language model (LLM) evaluations in clinical medicine

Shool, Sina; Adimi, Sara; Amleshi, Reza Saboori; Bitaraf, Ehsan; Golpira, Reza; Tara, Mahmood

doi:10.1186/s12911-025-02954-4

Review · BMC Medical Informatics and Decision Making · Mar 7, 2025 · Gold OA

Iran University of Medical Sciences · Shaheed Rajaei Cardiovascular Medical and Research Center

Indexed in: Crossref, DOAJ, PubMed

Abstract

Background

Large Language Models (LLMs), advanced AI tools based on transformer architectures, demonstrate significant potential in clinical medicine by enhancing decision support, diagnostics, and medical education. However, their integration into clinical workflows requires rigorous evaluation to ensure reliability, safety, and ethical alignment.

Objective

This systematic review examines the evaluation parameters and methodologies applied to LLMs in clinical medicine, highlighting their capabilities, limitations, and application trends.

Citation impact

Total citations: 215
FWCI: 102.35
Percentile: 100%
References: 16

Topics & keywords

Health informatics
Computer science
Medicine
Medical education
Public health
Nursing

UN Sustainable Development Goals

Quality Education
