Large Language Models for Chatbot Health Advice Studies

Huo, Bright; Boyle, Amy; Marfo, Nana; Tangamornsuksan, Wimonchat; Steen, Jeremy; McKechnie, Tyler; Lee, Yung; Mayol, Julio; Antoniou, Stavros A.; Thirunavukarasu, Arun James; Sanger, Stephanie; Ramji, Karim; Guyatt, Gordon

doi:10.1001/jamanetworkopen.2024.57879

Review · JAMA Network Open · Feb 4, 2025 · Gold OA

McMaster University · California Miramar University · +7 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Importance

There is much interest in the clinical integration of large language models (LLMs) in health care. Many studies have assessed the ability of LLMs to provide health advice, but the quality of their reporting is uncertain.

Objective

To perform a systematic review to examine the reporting variability among peer-reviewed studies evaluating the performance of generative artificial intelligence (AI)-driven chatbots for summarizing evidence and providing health advice to inform the development of the Chatbot Assessment Reporting Tool (CHART).

Evidence Review

A search of MEDLINE via Ovid, Embase via Elsevier, and Web of Science from inception to October 27, 2023, was conducted with the help of a health sciences librarian to yield 7752 articles. Two reviewers screened articles by title and abstract followed by full-text review to identify primary studies evaluating the clinical accuracy of generative AI-driven chatbots in providing health advice (chatbot health advice studies). Two reviewers then performed data extraction for 137 eligible studies.

Citation impact

Total citations: 135

FWCI: 64.27
Percentile: 100%
References: 186

Authors: 13

Topics & keywords

Keywords

Advice (programming)
Chatbot
MEDLINE
Data extraction
Medicine
Health care
Medical education
Computer science