Journal article

Model-based learning retrospectively updates model-free values

Abstract:
The predictive mind theory proposes that the brain continually predicts upcoming stimuli in order to process information efficiently and accurately. Bayesian brain theory suggests that the brain uses Bayesian probability models to make these predictions, while the free-energy minimization hypothesis proposes that the predictions serve to minimize free energy (uncertainty), ensuring accurate perception. Vertechi et al. (2020) examined whether animals use a stimulus-bound or an inference-based strategy to solve a Markov decision process (MDP) with two states (sites), exactly one of which is active at any time. On each guess or iteration, the active site switches to the other site with a certain probability and stays the same with the complementary probability. This setup served as the basis for our experiment, in which we trained three types of model-free artificial neural network (ANN) agents in the 2-state MDP environment: Deep Q-learning (DQN), Proximal Policy Optimization (PPO), and Recurrent PPO (RPPO) with a long short-term memory architecture. Each agent was tested in three environments with varying probabilities of active-site switching and reward allocation. The data showed that every ANN except one in the medium environment failed to reach an accuracy above the rate expected of an agent limited to 1-back memory. In the medium and difficult environments, the DQN was the best performer, followed closely by the RPPO. In past studies, PPO agents outperformed the DQN, which is inconsistent with our findings. However, our findings are consistent with Vertechi et al.'s (2020) prediction that a model-free, stimulus-bound agent would learn the environment less well depending on the frequency at which rewards were given. These findings also suggest that animals must use at least a mixture of model-free and model-based processing when solving problems and performing other cognitive tasks.
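The two-site environment described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the class `TwoSiteEnv`, the parameters `p_switch` and `p_reward`, and the win-stay/lose-shift baseline are hypothetical, and the abstract does not state the probabilities used or how the 1-back reference rate was computed.

```python
import random


class TwoSiteEnv:
    """Hypothetical two-site task: exactly one site is active at a time."""

    def __init__(self, p_switch, p_reward, seed=None):
        self.p_switch = p_switch  # assumed: chance the active site flips after each guess
        self.p_reward = p_reward  # assumed: chance a correct guess is rewarded
        self.rng = random.Random(seed)
        self.active = self.rng.randrange(2)

    def step(self, action):
        """Guess a site (0 or 1); return (reward, correct)."""
        correct = action == self.active
        reward = 1 if correct and self.rng.random() < self.p_reward else 0
        # The active site switches with p_switch and stays with 1 - p_switch.
        if self.rng.random() < self.p_switch:
            self.active = 1 - self.active
        return reward, correct


def win_stay_lose_shift(env, n_steps):
    """Stimulus-bound baseline: repeat the guess after a reward, else switch."""
    action, hits = 0, 0
    for _ in range(n_steps):
        reward, correct = env.step(action)
        hits += correct
        if reward == 0:
            action = 1 - action
    return hits / n_steps


if __name__ == "__main__":
    env = TwoSiteEnv(p_switch=0.1, p_reward=0.8, seed=1)
    print(f"baseline accuracy: {win_stay_lose_shift(env, 10_000):.3f}")
```

A baseline of this kind uses only the most recent outcome, so it gives a rough point of comparison for agents that cannot infer the hidden active site from a longer history. It is not the reference rate reported in the article.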
Publication status:
Published
Peer review status:
Peer reviewed

Authors

Institution:
University of Oxford
Role:
Author
Institution:
University of Oxford
Role:
Author
ORCID:
0000-0002-7361-9467
Institution:
University of Oxford
Role:
Author
ORCID:
0000-0003-0735-4349


Publisher:
Nature Research
Journal:
Scientific Reports
Volume:
12
Issue:
1
Pages:
2358-2358
Article number:
2358
Publication date:
2022-02-11
DOI:
EISSN:
2045-2322
ISSN:
2045-2322


Language:
English
Keywords:
Pubs id:
1240430
Local pid:
pubs:1240430
Source identifiers:
W4211216616
Deposit date:
2026-04-09
ARK identifier:
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
