Transforming literature screening: The emerging role of large language models in systematic reviews

Delgado-Chaves, Fernando M.; Jennings, Matthew J.; Atalaia, António; Wolff, Justus; Horváth, Rita; Mamdouh, Zeinab M.; Baumbach, Jan; Baumbach, Linda

doi:10.1073/pnas.2411962122

Article · Proceedings of the National Academy of Sciences · Jan 6, 2025 · Hybrid OA

Universität Hamburg · Columbia University · +10 more institutions

Indexed in: Crossref, PubMed

Abstract

Systematic reviews (SRs) synthesize evidence-based medical literature but involve labor-intensive manual article screening. Large language models (LLMs) can select relevant literature, but how their quality and efficacy compare with those of human reviewers is still being determined. We evaluated the overlap between the articles selected by 18 different LLMs on the basis of titles and abstracts and the human-selected articles for three SRs. In the three SRs, 185/4,662, 122/1,741, and 45/66 articles were selected and considered for full-text screening by two independent reviewers. Due to technical variations and the inability of the LLMs to classify all records, the sample sizes considered for the LLMs were smaller. However, on average, the 18 LLMs…

Citation impact

Total citations: 51

FWCI: 51.83
Percentile: 100%
References: 29

Authors: 8

Topics & keywords

Keywords

Workload
Inclusion (mineral)
Medicine
Psychology
Computer science
Social psychology

UN Sustainable Development Goals

Reduced inequalities