Safe learning in humans and machines

Mahajan, P

Collections: Thesis

Abstract: Intelligent agents, biological or artificial, face a fundamental dilemma: how to learn safely from experience when learning inevitably involves making mistakes. This requires the ability to explore environments with caution during learning (i.e., safe exploration) and to infer deviations from homeostatic grace, such as injury in animals or faults in robots, to reorganise behaviour appropriately (i.e., self-preservation).

This work first explores safe exploration through strategies that combine multiple value functions. A critical safety-efficiency trade-off is identified, arising from the conflict between instrumental control, which learns the consequences of actions, and learned defensive reflexes such as Pavlovian biases to withdraw from aversive stimuli. It is hypothesised that this trade-off can be resolved by gating Pavlovian avoidance based on outcome uncertainty, and a basic test is provided in a human approach–withdrawal virtual reality experiment. Noting the suboptimality underlying Pavlovian misbehaviour, the thesis subsequently proposes a mechanism by which the dopaminergic system could optimally compose multiple values to support efficient, safe, and stable learning.

Shifting focus from external threats to bodily integrity, the thesis next addresses the problem of self-preservation, with particular emphasis on the computational representation of injury. Post-injury homeostasis is modelled as a partially observable Markov decision process (POMDP), explaining counterintuitive behaviours such as investigating an injury despite immediate pain. This framework is used to mathematically formalise an information-restriction model of pain chronification, providing a quantitative complement to the Fear-Avoidance model. These concepts are then extended to machines: robots performing stereotypical movements can employ self-supervised learning and local learning rules to build internal models of expected sensorimotor experience, enabling fault detection and adaptive responses to unexpected deviations.

Together, this work advances the understanding of safe learning, a challenge shared by humans and machines, with implications for understanding post-injury transitions to chronic pain and the development of neuro-inspired safe AI.

Cite this record

APA Style

Mahajan, P. (2025). Safe learning in humans and machines [PhD thesis]. University of Oxford.

MLA Style

Mahajan, P. Safe Learning in Humans and Machines. 2025. University of Oxford, PhD thesis.

Chicago Style

Mahajan, P. 2025. “Safe Learning in Humans and Machines.” PhD thesis, University of Oxford.