Review · BMC Medical Informatics and Decision Making · Mar 12, 2024 · Gold OA

Assessing the research landscape and clinical utility of large language models: a scoping review

University of New Brunswick · University of Toronto · +2 more institutions

Indexed in Crossref · DOAJ · PubMed

Abstract

Importance

Large language models (LLMs) like OpenAI's ChatGPT are powerful generative systems that rapidly synthesize natural language responses. Research on LLMs has revealed their potential and pitfalls, especially in clinical settings. However, the evolving landscape of LLM research in medicine has left several gaps regarding their evaluation, application, and evidence base.

Objective

This scoping review aims to (1) summarize current research evidence on the accuracy and efficacy of LLMs in medical applications, (2) discuss the ethical, legal, logistical, and socioeconomic implications of LLM use in clinical settings, (3) explore barriers and facilitators to LLM implementation in healthcare, (4) propose a standardized evaluation framework for assessing LLMs' clinical utility, and (5) identify evidence gaps and propose future research directions for LLMs in clinical applications.

Evidence review

We screened 4,036 records from MEDLINE, EMBASE, CINAHL, medRxiv, bioRxiv, and arXiv from January 2023 (inception of the search) to June 26, 2023 for English-language papers and analyzed findings from 55 worldwide studies. Quality of evidence was reported based on the Oxford Centre for Evidence-based Medicine recommendations.

Citation impact

Total citations: 161
FWCI: 17.20
Percentile: 100%
References: 64

Authors

7

Topics & keywords

Keywords
  • CINAHL
  • MEDLINE
  • Health care
  • Medicine
  • Socioeconomic status
  • Political science
  • Environmental health
  • Population
UN Sustainable Development Goals
  • Peace, Justice and strong institutions