Journal article

LLMs achieve adult human performance on higher-order theory of mind tasks

Abstract:
This paper examines the extent to which large language models (LLMs) are able to perform tasks which require higher-order theory of mind (ToM)—the human ability to reason about multiple mental and emotional states in a recursive manner (e.g., I think that you believe that she knows). This paper builds on prior work by introducing a handwritten test suite—Multi-Order Theory of Mind Q&A—and using it to compare the performance of five LLMs of varying sizes and training paradigms to a newly gathered adult human benchmark. We find that GPT-4 and Flan-PaLM reach adult-level and near adult-level performance on our ToM tasks overall, and that GPT-4 exceeds adult performance on 6th order inferences. Our results suggest that there is an interplay between model size and finetuning for higher-order ToM performance, and that the linguistic abilities of large models may support more complex ToM inferences. Given the important role that higher-order ToM plays in group social interaction and relationships, these findings have significant implications for the development of a broad range of social, educational and assistive LLM applications.
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.3389/fnhum.2025.1633272

Authors


Publisher:
Frontiers Media
Journal:
Frontiers in Human Neuroscience
Volume:
19
Article number:
1633272
Publication date:
2026-01-02
Acceptance date:
2025-10-21
DOI:
10.3389/fnhum.2025.1633272
EISSN:
1662-5161
ISSN:
1662-5161


Language:
English
Keywords:
Pubs id:
2361380
UUID:
uuid_f784d950-b093-4e7d-8529-550d0a62e9e7
Local pid:
pubs:2361380
Source identifiers:
3667551
Deposit date:
2026-01-16
ARK identifier:

This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
