Conference item

Walking the values in Bayesian inverse reinforcement learning

Abstract:
The goal of Bayesian inverse reinforcement learning (IRL) is to recover a posterior distribution over reward functions using a set of demonstrations from an expert optimizing for a reward unknown to the learner. The resulting posterior over rewards can then be used to synthesize an apprentice policy that performs well on the same or a similar task. A key challenge in Bayesian IRL is bridging the computational gap between the hypothesis space of possible rewards and the likelihood, often defined in terms of Q-values: vanilla Bayesian IRL needs to solve the costly forward planning problem - going from rewards to the Q-values - at every step of the algorithm, which may need to be done thousands of times. We propose to solve this by a simple change: instead of primarily sampling in the space of rewards, we can work primarily in the space of Q-values, since the computation required to go from Q-values to rewards is radically cheaper. Furthermore, this reversal of the computation makes it easy to compute the gradient, allowing efficient sampling using Hamiltonian Monte Carlo. We propose ValueWalk - a new Markov chain Monte Carlo method based on this insight - and illustrate its advantages on several tasks.
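The cost asymmetry described in the abstract can be illustrated with a minimal tabular sketch. It assumes a small finite MDP with known transition probabilities and the standard Bellman optimality equation, Q(s, a) = r(s, a) + γ Σ_t P(t | s, a) max_b Q(t, b). The function names and the toy numbers are illustrative and are not taken from the paper, and the code is not the ValueWalk sampler itself: it only shows that going from Q-values to rewards is a single pass over the table, whereas going from rewards to Q-values requires iterating to a fixed point.

```python
def rewards_from_q(q, transitions, gamma):
    """Invert the Bellman optimality equation for a tabular MDP.

    q[s][a]              : Q-value of action a in state s
    transitions[s][a][t] : probability of moving from s to t under a
    Returns r[s][a] = q[s][a] - gamma * sum_t P(t | s, a) * max_b q[t][b].
    """
    # State values implied by the Q-table: V(t) = max_b Q(t, b).
    v = [max(row) for row in q]
    return [
        [
            q[s][a] - gamma * sum(p * v[t] for t, p in enumerate(transitions[s][a]))
            for a in range(len(q[s]))
        ]
        for s in range(len(q))
    ]


def q_from_rewards(r, transitions, gamma, sweeps=1000):
    """Forward planning by value iteration, shown only for comparison."""
    q = [[0.0] * len(row) for row in r]
    for _ in range(sweeps):
        v = [max(row) for row in q]
        q = [
            [
                r[s][a] + gamma * sum(p * v[t] for t, p in enumerate(transitions[s][a]))
                for a in range(len(r[s]))
            ]
            for s in range(len(r))
        ]
    return q


# Toy two-state, two-action MDP (all numbers are made up for illustration).
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: action 0 stays, action 1 moves
    [[0.5, 0.5], [0.0, 1.0]],  # from state 1
]
true_r = [[0.0, 1.0], [0.5, 2.0]]

q_star = q_from_rewards(true_r, P, gamma=0.9)    # many sweeps over the table
recovered = rewards_from_q(q_star, P, gamma=0.9)  # one pass over the table
```

In this sketch `recovered` matches `true_r` up to the convergence error of the value iteration, which is the sense in which the reward can be read off cheaply once the Q-values are given.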
Publication status:
Published
Peer review status:
Peer reviewed

Access Document

Publication website:
https://proceedings.mlr.press/v244/bajgar24a.html

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Lady Margaret Hall
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Oxford college:
Linacre College
Role:
Author
ORCID:
0000-0002-5627-9093
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Exeter College
Role:
Author
ORCID:
0000-0003-1959-012X


Publisher:
PMLR
Host title:
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
Volume:
244
Pages:
273-287
Series:
Proceedings of Machine Learning Research
Publication date:
2024-07-15
Acceptance date:
2024-04-15
Event title:
40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)
Event location:
Barcelona, Spain
Event website:
https://www.auai.org/uai2024/
Event start date:
2024-07-15
Event end date:
2024-07-19
EISSN:
2640-3498
ISSN:
1525-3384


Language:
English
Pubs id:
2073248
Local pid:
pubs:2073248
Deposit date:
2025-02-25