Conference item

Safely interruptible agents

Abstract:
Reinforcement learning agents interacting with a complex environment like the real world are unlikely to behave optimally all the time. If such an agent is operating in real time under human supervision, now and then it may be necessary for a human operator to press the big red button to prevent the agent from continuing a harmful sequence of actions (harmful either for the agent or for the environment) and lead the agent into a safer situation. However, if the learning agent expects to receive rewards from this sequence, it may learn in the long run to avoid such interruptions, for example by disabling the red button, which is an undesirable outcome. This paper explores a way to make sure a learning agent will not learn to prevent (or seek!) being interrupted by the environment or a human operator. We provide a formal definition of safe interruptibility and exploit the off-policy learning property to prove that some agents are either already safely interruptible, like Q-learning, or can easily be made so, like Sarsa. We show that even ideal, uncomputable reinforcement learning agents for (deterministic) general computable environments can be made safely interruptible.
Publication status:
Published
Peer review status:
Peer reviewed

Publication website:
https://auai.org/uai2016/proceedings.php

Authors

Institution:
University of Oxford
Division:
HUMS
Department:
Philosophy
Research group:
The Future of Humanity Institute
Role:
Author


Publisher:
AUAI Press
Host title:
Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Second Conference (2016)
Pages:
557-566
Publication date:
2016-06-25
Acceptance date:
2016-05-06
Event title:
32nd Conference on Uncertainty in Artificial Intelligence (UAI 2016)
Event location:
Jersey City, New Jersey, USA
Event website:
http://auai.org/uai2016/index.php
Event start date:
2016-06-25
Event end date:
2016-06-29
ISBN:
978-0-9966431-1-5


Language:
English
Pubs id:
pubs:624187
UUID:
uuid:17c0e095-4e13-47fc-bace-64ec46134a3f
Local pid:
pubs:624187
Source identifiers:
624187
Deposit date:
2016-05-26
