Synthesizing scientific literature with retrieval-augmented language models
University of Washington · Allen Institute · +4 more institutions
Abstract
Scientific progress depends on the ability of researchers to synthesize the growing body of literature. Can large language models (LLMs) assist scientists in this task? Here we introduce OpenScholar, a specialized retrieval-augmented language model (LM) [1] that answers scientific queries by identifying relevant passages from 45 million open-access papers and synthesizing citation-backed responses. To evaluate OpenScholar, we develop ScholarQABench, the first large-scale multi-domain benchmark for literature search, comprising 2,967 expert-written queries and 208 long-form answers across computer science, physics, neuroscience and biomedicine. Despite being a smaller open model, OpenScholar-8B outperforms GPT-4o…
Citation impact
- FWCI: 74.17
- Percentile: 100%
- References: 44
Authors
28
Topics & keywords
- Correctness
- Inference
- Benchmark (surveying)
- Language model
- Task (project management)
- Citation
- Quality Education