Thesis

A computational model of habits in the brain

Abstract:
The neural mechanism of action-selection is often modelled as a combination of goal-directed learning and habitual movements. Impairments in the function and interplay of these processes are associated with the aetiology of many neural pathologies. Until recently, a fundamental discrepancy existed in the definition of habits between experimental and computational models.

Experimentally, habits are defined as the reward-independent, stimulus-response relationships which form when an action is regularly executed in the same context, regardless of outcome. In contrast, computational models largely represent habits through model-free algorithms reliant on reward prediction errors.

This thesis expands upon proposals by Miller et al. (2019) and Bogacz (2020), which resolve this inconsistency through dopaminergic action prediction errors rather than reward prediction errors. Specifically, we extend their work to test for the existence of such signals in neural and behavioural data. To this end, we develop two novel computational models and compare their ability to replicate the associated data against that of their gold-standard reward prediction error equivalents.
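The contrast between the two kinds of prediction error can be sketched in a few lines of Python. This is an illustrative assumption in the spirit of the value-free habit update described by Miller et al. (2019), not the models developed in this thesis; the function names, the learning rate `alpha` and the two-action setup are all hypothetical.

```python
def reward_prediction_error_update(q, action, reward, alpha=0.1):
    """Model-free sketch: the chosen action's value moves toward the reward."""
    q = list(q)
    delta = reward - q[action]          # reward prediction error
    q[action] += alpha * delta
    return q, delta


def action_prediction_error_update(h, action, alpha=0.1):
    """Value-free sketch: habit strengths move toward the action actually taken."""
    # Action prediction error: one-hot code of the executed action minus habit strength.
    errors = [(1.0 if i == action else 0.0) - h_i for i, h_i in enumerate(h)]
    new_h = [h_i + alpha * e for h_i, e in zip(h, errors)]
    return new_h, errors


# Repeat action 0 fifty times in the same context without ever rewarding it.
q = [0.0, 0.0]   # action values (hypothetical two-action task)
h = [0.0, 0.0]   # habit strengths
for _ in range(50):
    q, _ = reward_prediction_error_update(q, action=0, reward=0.0)
    h, _ = action_prediction_error_update(h, action=0)
```

Under these assumptions the action values never change because no reward arrives, whereas the habit strength for the repeated action grows toward 1. This mirrors the experimental definition of a habit as forming through repetition regardless of outcome.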

This thesis first focusses on dopaminergic signals across continuous time. We begin by outlining our 'temporal-difference action learning' algorithm, which uses biologically-plausible mechanisms to determine how dynamic changes in action intensity influence the resultant prediction errors across near-continuous time. We then demonstrate that dopaminergic data collected by Greenstreet et al. (2025) from the tail of the striatum is better represented by action prediction errors than by reward prediction errors.

The later chapters explore the detection of value-free habits under time-constrained conditions in human behavioural data from Hardwick et al. (2019). This required the creation of our 'two-drift race diffusion model' and corresponding analytical solutions since, to our knowledge, no existing algorithm included mid-trial drift changes in multi-alternative forced-choice paradigms. Finally, we establish that most participants' behaviour was best described by stimulus-response relationships which evolved from action prediction errors.
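To illustrate what a mid-trial drift change means in a race between several accumulators, the following is a minimal Monte Carlo sketch. It is not the thesis's 'two-drift race diffusion model' and does not reproduce its analytical solutions; the function name, the parameter names (`drifts_before`, `drifts_after`, `t_switch`, `threshold`, `noise`) and the Euler-Maruyama discretisation are assumptions made purely for illustration.

```python
import random


def simulate_two_drift_race(drifts_before, drifts_after, t_switch,
                            threshold=1.0, noise=1.0, dt=0.001,
                            t_max=5.0, rng=None):
    """Simulate one trial of a race between independent diffusing accumulators.

    Each accumulator drifts at drifts_before[i] until t_switch and at
    drifts_after[i] afterwards. The first accumulator to reach the threshold
    determines the choice. Returns (choice, response_time), or (None, t_max)
    if no accumulator finishes in time.
    """
    rng = rng or random.Random()
    n = len(drifts_before)
    x = [0.0] * n                       # accumulated evidence per alternative
    t = 0.0
    sqrt_dt = dt ** 0.5
    while t < t_max:
        drifts = drifts_before if t < t_switch else drifts_after
        for i in range(n):
            # Euler-Maruyama step: deterministic drift plus Gaussian noise.
            x[i] += drifts[i] * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
        finished = [i for i in range(n) if x[i] >= threshold]
        if finished:
            # If several cross in the same step, take the largest overshoot.
            return max(finished, key=lambda i: x[i]), t
    return None, t_max


# One hypothetical three-alternative trial: alternative 0 is favoured first,
# then alternative 1 takes over 0.15 s into the trial.
choice, rt = simulate_two_drift_race([3.0, 0.0, 0.0], [0.0, 3.0, 0.0],
                                     t_switch=0.15, noise=0.5,
                                     rng=random.Random(0))
```

In a sketch like this, responses forced early in a trial tend to reflect the first set of drifts and later responses the second, which is the kind of time-dependent signature a time-constrained paradigm can probe.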

Overall, our results support the existence of value-free action prediction errors and associated habitual behaviour in dopaminergic signals and human behavioural data. This thesis also provides a proof-of-concept application of our novel models and opens an avenue for future research to test their predictions more directly using data collected specifically for this purpose.

Authors


Institution:
University of Oxford
Division:
MSD
Department:
Clinical Neurosciences
Oxford college:
St Anne's College
Role:
Author

Contributors

Institution:
University of Oxford
Division:
MSD
Department:
Clinical Neurosciences
Role:
Supervisor


Funder identifier:
https://ror.org/03x94j517
Grant:
HMR03260_HM09.05
Programme:
Medical Research Council (MRC) - 3 Year Research Studentship (BNDU DTA)


DOI:
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford

