Conference item
Walking the values in Bayesian inverse reinforcement learning
- Abstract:
- The goal of Bayesian inverse reinforcement learning (IRL) is to recover a posterior distribution over reward functions from a set of demonstrations by an expert optimizing a reward unknown to the learner. The resulting posterior over rewards can then be used to synthesize an apprentice policy that performs well on the same or a similar task. A key challenge in Bayesian IRL is bridging the computational gap between the hypothesis space of possible rewards and the likelihood, which is often defined in terms of Q-values: vanilla Bayesian IRL must solve the costly forward planning problem (going from rewards to Q-values) at every step of the algorithm, which may mean thousands of times. We propose to solve this with a simple change: instead of sampling primarily in the space of rewards, we work primarily in the space of Q-values, since the computation required to go from Q-values to rewards is radically cheaper. Furthermore, this reversal of the computation makes it easy to compute the gradient, allowing efficient sampling using Hamiltonian Monte Carlo. We propose ValueWalk, a new Markov chain Monte Carlo method based on this insight, and illustrate its advantages on several tasks.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
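The cost asymmetry described in the abstract can be illustrated with a minimal sketch. It assumes a small tabular MDP with known transition probabilities and the Bellman optimality equation Q(s, a) = R(s, a) + γ Σ_s' P(s' | s, a) max_a' Q(s', a'). The function names, the toy MDP, and the value-iteration planner are hypothetical illustrations and are not taken from the paper; this is not the ValueWalk algorithm itself, only the direction-of-computation argument it builds on.

```python
# Hypothetical tabular sketch: all names and the toy MDP are illustrative assumptions.

def rewards_from_q(q, transitions, gamma):
    """Cheap direction: one Bellman backup per (state, action) pair.

    R(s, a) = Q(s, a) - gamma * sum_s' P(s' | s, a) * max_a' Q(s', a')
    """
    state_values = [max(row) for row in q]  # V(s) = max_a Q(s, a)
    return [
        [
            q[s][a]
            - gamma * sum(p * v for p, v in zip(transitions[s][a], state_values))
            for a in range(len(q[s]))
        ]
        for s in range(len(q))
    ]


def q_from_rewards(rewards, transitions, gamma, sweeps=1000):
    """Costly direction: forward planning by repeated value-iteration sweeps."""
    q = [[0.0] * len(row) for row in rewards]
    for _ in range(sweeps):
        state_values = [max(row) for row in q]
        q = [
            [
                rewards[s][a]
                + gamma * sum(p * v for p, v in zip(transitions[s][a], state_values))
                for a in range(len(rewards[s]))
            ]
            for s in range(len(rewards))
        ]
    return q


# Toy 2-state, 2-action MDP; TRANSITIONS[s][a][s2] = P(s2 | s, a).
TRANSITIONS = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.0, 1.0]],
]
GAMMA = 0.9
TRUE_REWARDS = [[0.0, 1.0], [0.5, 2.0]]

# Planning needs many sweeps; inverting it needs a single pass.
q_star = q_from_rewards(TRUE_REWARDS, TRANSITIONS, GAMMA)
recovered = rewards_from_q(q_star, TRANSITIONS, GAMMA)
```

In this sketch, `rewards_from_q` touches each state-action pair once, whereas `q_from_rewards` repeats a comparable amount of work on every sweep until convergence. The inverse map is also built from subtraction, weighted sums, and a max, so it is differentiable almost everywhere, which is consistent with the abstract's remark about gradients.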
Access Document
- Files:
- Version of record (PDF, 3.0MB)
- Publication website:
- https://proceedings.mlr.press/v244/bajgar24a.html
Authors
- Publisher:
- PMLR
- Host title:
- Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
- Volume:
- 244
- Pages:
- 273-287
- Series:
- Proceedings of Machine Learning Research
- Publication date:
- 2024-07-15
- Acceptance date:
- 2024-04-15
- Event title:
- 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)
- Event location:
- Barcelona, Spain
- Event website:
- https://www.auai.org/uai2024/
- Event start date:
- 2024-07-15
- Event end date:
- 2024-07-19
- EISSN:
- 2640-3498
- ISSN:
- 1525-3384
- Language:
- English
- Pubs id:
- 2073248
- Local pid:
- pubs:2073248
- Deposit date:
- 2025-02-25
- ARK identifier:
Terms of use
- Copyright holder:
- Bajgar et al.
- Copyright date:
- 2024
- Rights statement:
- © 2024 by the author(s). This is an open access article under the CC-BY license.
- Notes:
- This paper was presented at the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024), 14th-19th July 2024, Barcelona, Spain.
- Licence:
- CC Attribution (CC BY)