Journal article

Will AI avoid exploitation? Artificial general intelligence and expected utility theory

Abstract:
A simple argument suggests that we can fruitfully model advanced AI systems using expected utility theory. According to this argument, an agent will need to act as if maximising expected utility if they're to avoid exploitation. Insofar as we should expect advanced AI to avoid exploitation, it follows that we should expect advanced AI to act as if maximising expected utility. I spell out this argument more carefully and demonstrate that it fails, but show that the manner of its failure is instructive: in exploring the argument, we gain insight into how to model advanced AI systems.
Publication status:
Published
Peer review status:
Peer reviewed

Publisher copy:
10.1007/s11098-023-02023-4

Authors


Institution:
University of Oxford
Division:
HUMS
Department:
Philosophy Faculty
Role:
Author


Publisher:
Springer
Journal:
Philosophical Studies
Volume:
182
Issue:
7
Pages:
1519-1538
Publication date:
2023-08-05
Acceptance date:
2023-07-16
DOI:
10.1007/s11098-023-02023-4
EISSN:
1573-0883
ISSN:
0031-8116


Language:
English
Pubs id:
1493167
Local pid:
pubs:1493167
Deposit date:
2023-07-18
