Journal article

Neurodivergent influenceability in agentic AI as a contingent solution to the AI alignment problem

Abstract:
Ensuring that AI systems, including artificial general intelligence and artificial superintelligence, behave in alignment with human values and interests presents significant challenges; this is known as the AI alignment problem. As AI advances, concerns about control and existential risks become increasingly relevant. Here, we introduce the concepts of agentic influenceability, behavioral neurodivergent diversity, opinion attack, associated opinion, and influenceability scores, together with a mathematical proof of the inevitability of misalignment and the impossibility of full orchestrated controllability of agentic systems, based on formal undecidability and irreducibility arguments. We explore whether embracing this inevitable misalignment can foster a dynamic ecosystem of adversarial and collaborative AI agents without central orchestration (which would itself constitute another agent) while still offering some degree of soft controllability. The investigation demonstrates that misalignment in foundation models can serve as a counterbalancing mechanism, enabling the agents most aligned with human interests to cooperate and so prevent divergent dominance by any single agent. Experiments with large language models show that open models exhibit greater behavioral diversity, whereas proprietary models, constrained by artificial guardrails, display more limited controllability. The findings advocate for neurodivergent influenceability as a contingent response to mathematically uncontrollable misalignment, leveraging agent divergence to improve AI safety.
Publication status:
Published
Peer review status:
Peer reviewed

Authors

Institution:
University of Oxford
Role:
Author
ORCID:
0009-0007-4791-0625

Institution:
University of Oxford
Role:
Author

Role:
Author
ORCID:
0000-0002-2101-2428

Institution:
University of Oxford
Role:
Author
ORCID:
0000-0003-0634-4384


Publisher:
Oxford University Press
Journal:
PNAS Nexus
Volume:
5
Issue:
4
Article number:
pgag076
Publication date:
2026-04-14
Acceptance date:
2025-12-22
DOI:
EISSN:
2752-6542
ISSN:
2752-6542


Language:
English
Keywords:
Pubs id:
2420650
Local pid:
pubs:2420650
Source identifiers:
3948194
Deposit date:
2026-04-21
ARK identifier:
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
