Generalization bias in large language model summarization of scientific research

Peters, Uwe; Chin‐Yee, Benjamin

doi:10.1098/rsos.241776

Article · Royal Society Open Science · Apr 1, 2025 · Gold OA

Uwe Peters · Benjamin Chin‐Yee

Utrecht University · Western University · +1 more institution

Indexed in: Crossref, DOAJ, PubMed

Abstract

Artificial intelligence chatbots driven by large language models (LLMs) have the potential to increase public science literacy and support scientific research, as they can quickly summarize complex scientific information in accessible terms. However, when summarizing scientific texts, LLMs may omit details that limit the scope of research conclusions, leading to generalizations of results broader than warranted by the original study. We tested 10 prominent LLMs, including ChatGPT-4o, ChatGPT-4.5, DeepSeek, LLaMA 3.3 70B, and Claude 3.7 Sonnet, comparing 4900 LLM-generated summaries to their original scientific texts. Even when explicitly prompted for accuracy, most LLMs produced broader generalizations of…

Citation impact

Total citations: 72

FWCI: 136.18
Percentile: 100%
References: 47

Authors: 2

Topics & keywords

Keywords

Automatic summarization
Generalization
Computer science
Natural language processing
Artificial intelligence
Language model
Epistemology
Philosophy
