Testing theory of mind in large language models and humans

Strachan, James W. A.; Albergo, Dalila; Borghini, Giulia; Pansardi, Oriana; Scaliti, Eugenio; Gupta, Saurabh; Saxena, K. B.; Rufo, Alessandro; Panzeri, Stefano; Manzi, Guido; Graziano, Michael S. A.; Becchio, Cristina

doi:10.1038/s41562-024-01882-z

Article · Nature Human Behaviour · May 20, 2024 · Hybrid OA

Universität Hamburg · University Medical Center Hamburg-Eppendorf · +4 more institutions

Indexed in: Crossref, PubMed

Abstract

At the core of what defines us as humans is the concept of theory of mind: the ability to track other people's mental states. The recent development of large language models (LLMs) such as ChatGPT has led to intense debate about the possibility that these models exhibit behaviour that is indistinguishable from human behaviour in theory of mind tasks. Here we compare human and LLM performance on a comprehensive battery of measurements that aim to measure different theory of mind abilities, from understanding false beliefs to interpreting indirect requests and recognizing irony and faux pas. We tested two families of LLMs (GPT and LLaMA2) repeatedly against these measures and compared their performance with…

Citation impact

Total citations: 216

FWCI: 150.46
Percentile: 100%
References: 57

Authors: 12

Topics & keywords

Keywords

Psychology
Cognitive science
Cognitive psychology
Computer science

UN Sustainable Development Goals

No poverty
