Mathematical Capabilities of ChatGPT

Frieder, Simon; Pinchetti, Luca; Chevalier, Alexis; Griffiths, Ryan‐Rhys; Salvatori, Tommaso; Lukasiewicz, Thomas; Petersen, Philipp; Berner, Julius

doi:10.48550/arxiv.2301.13867

Preprint, arXiv (Cornell University), Jan 31, 2023, Green OA

Indexed in: arXiv, DataCite

Abstract

We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. In contrast to formal mathematics, where large databases of formal proofs are available (e.g., the Lean Mathematical Library), current datasets of natural-language mathematics, used to benchmark language models, either cover only elementary mathematics or are very small. We address this by publicly releasing two new datasets: GHOSTS and miniGHOSTS. These are the first natural-language datasets curated by working researchers in mathematics that (1) aim to cover graduate-level…

Citation impact

Total citations: 298

FWCI: —
Percentile: —
References: 0

Authors: 8

Topics & keywords

Keywords

Mathematical proof
Benchmark (surveying)
Computer science
Cover (algebra)
Range (aeronautics)
Base (topology)
Mathematical practice
Interface (matter)

UN Sustainable Development Goals

Quality Education
