Article · Nature · Feb 4, 2026 · Hybrid OA

Synthesizing scientific literature with retrieval-augmented language models

University of Washington · Allen Institute · +4 more institutions

Indexed in: Crossref, PubMed

Abstract

Scientific progress depends on the ability of researchers to synthesize the growing body of literature. Can large language models (LLMs) assist scientists in this task? Here we introduce OpenScholar, a specialized retrieval-augmented language model (LM)1 that answers scientific queries by identifying relevant passages from 45 million open-access papers and synthesizing citation-backed responses. To evaluate OpenScholar, we develop ScholarQABench, the first large-scale multi-domain benchmark for literature search, comprising 2,967 expert-written queries and 208 long-form answers across computer science, physics, neuroscience and biomedicine. Despite being a smaller open model, OpenScholar-8B outperforms GPT-4o…

Citation impact

  • Total citations: 9
  • FWCI: 74.17
  • Percentile: 100%
  • References: 44

Authors

28

Topics & keywords

Keywords
  • Correctness
  • Inference
  • Benchmark (surveying)
  • Language model
  • Task (project management)
  • Citation
UN Sustainable Development Goals
  • Quality Education

Funding