LLMs achieve adult human performance on higher-order theory of mind tasks

Street, W; Siy, JO; Keeling, G; Baranes, A; Barnett, B; McKibben, M; Kanyere, T; Lentz, A; Arcas, BAY; Dunbar, RIM

Journal article

Abstract:: This paper examines the extent to which large language models (LLMs) are able to perform tasks which require higher-order theory of mind (ToM)—the human ability to reason about multiple mental and emotional states in a recursive manner (e.g., I think that you believe that she knows). This paper builds on prior work by introducing a handwritten test suite—Multi-Order Theory of Mind Q&A—and using it to compare the performance of five LLMs of varying sizes and training paradigms to a newly gathered adult human benchmark. We find that GPT-4 and Flan-PaLM reach adult-level and near adult-level performance on our ToM tasks overall, and that GPT-4 exceeds adult performance on 6th order inferences. Our results suggest that there is an interplay between model size and finetuning for higher-order ToM performance, and that the linguistic abilities of large models may support more complex ToM inferences. Given the important role that higher-order ToM plays in group social interaction and relationships, these findings have significant implications for the development of a broad range of social, educational and assistive LLM applications.

Publication status:: Published

Peer review status:: Peer reviewed

Cite this record

APA Style

Street, W., Siy, J. O., Keeling, G., Baranes, A., Barnett, B., McKibben, M., Kanyere, T., Lentz, A., Arcas, B. A. Y., & Dunbar, R. I. M. (2026). LLMs achieve adult human performance on higher-order theory of mind tasks. Frontiers in Human Neuroscience, 19.

MLA Style

Street, W, et al. “LLMs Achieve Adult Human Performance on Higher-Order Theory of Mind Tasks.” Frontiers in Human Neuroscience, vol. 19, 2026.

Chicago Style

Street, W, JO Siy, G Keeling, et al. 2026. “LLMs Achieve Adult Human Performance on Higher-Order Theory of Mind Tasks.” Frontiers in Human Neuroscience 19.

Access Document

Files:: Street_et_al_2026_LLMs_achieve_adult.pdf

(Version of record, pdf, 731.7KB)

Street_et_al_2026_LLMs_achieve_adult_Supplementary_materials.zip

(Supplementary materials, zip, 322.1KB)

Publisher copy:: 10.3389/fnhum.2025.1633272

Authors

Street, W

Role:: Author

Siy, JO

Role:: Author

Keeling, G

Role:: Author

Baranes, A

Role:: Author

Barnett, B

Role:: Author

Publisher:: Frontiers Media
Journal:: Frontiers in Human Neuroscience
Volume:: 19
Article number:: 1633272
Publication date:: 2026-01-02
Acceptance date:: 2025-10-21
DOI:: 10.3389/fnhum.2025.1633272
EISSN:: 1662-5161
ISSN:: 1662-5161

Language:: English
Keywords:: mentalizing; social AI; large language models; AI; social cognition; theory of mind
Pubs id:: 2361380
UUID:: uuid_f784d950-b093-4e7d-8529-550d0a62e9e7
Local pid:: pubs:2361380
Source identifiers:: 3667551
Deposit date:: 2026-01-16
ARK identifier:: ark:/29072/ora_f784d950b0934e7d8529550d0a62e9e7

Terms of use

Licence:: CC Attribution (CC BY)
