Galactica: A Large Language Model for Science
Indexed in arXiv, DataCite
Abstract
Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but they are unable to organize scientific knowledge alone. In this paper we introduce Galactica: a large language model that can store, combine and reason about scientific knowledge. We train on a large scientific corpus of papers, reference material, knowledge bases and many other sources. We outperform existing models on a range of scientific tasks. On technical knowledge probes such as LaTeX equations, Galactica outperforms the latest GPT-3 by…
Citation impact
259 total citations
- FWCI: —
- Percentile: —
- References: 0
Authors
9
Topics & keywords
Topics
Keywords
- Computer science
- Obstacle
- Sociology of scientific knowledge
- Language model
- Data science
- Artificial intelligence
- Sociology
- Social science
UN Sustainable Development Goals
- Quality Education