Stop! planner time: metareasoning for probabilistic planning using learned performance profiles

Conference item

Abstract:: The metareasoning framework aims to enable autonomous agents to factor in planning costs when making decisions. In this work, we develop the first non-myopic metareasoning algorithm for planning with Markov decision processes. Our method learns the behaviour of anytime probabilistic planning algorithms from performance data. Specifically, we propose a novel model for metareasoning, based on contextual performance profiles that predict the value of the planner’s current solution given the time spent planning, the state of the planning algorithm’s internal parameters, and the difficulty of the planning problem being solved. This model removes the need to assume that the current solution quality is always known, broadening the class of metareasoning problems that can be addressed. We then employ deep reinforcement learning to learn a policy that decides, at each timestep, whether to continue planning or start executing the current plan, and how to set hyperparameters of the planner to enhance its performance. We demonstrate our algorithm’s ability to perform effective metareasoning in two domains.

Files:: Hawes_et_al_2023_Stop_planner_time.pdf

(Accepted manuscript, pdf, 728.8KB)

Publisher:: Association for the Advancement of Artificial Intelligence
Host title:: Proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)
Volume:: 38
Issue:: 18
Pages:: 20053-20060
Publication date:: 2024-03-24
Acceptance date:: 2023-12-09
Event title:: 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)
Event location:: Vancouver, Canada
Event website:: https://aaai.org/aaai-conference/
Event start date:: 2024-02-20
Event end date:: 2024-02-27
DOI:: 10.1609/aaai.v38i18.29983

Language:: English
Keywords:: RU: Sequential Decision Making

SO: Metareasoning and Metaheuristics

PRS: Planning under Uncertainty

PRS: Learning for Planning and Scheduling

PRS: Planning with Markov Models (MDPs, POMDPs)
Pubs id:: 1653642
Local pid:: pubs:1653642
Deposit date:: 2024-02-28

Copyright holder:: Association for the Advancement of Artificial Intelligence
Notes:: This paper was presented at the 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024), 20th-27th February 2024, Vancouver, Canada. This is the accepted manuscript version of the article. The final version is available online from the Association for the Advancement of Artificial Intelligence at https://dx.doi.org/10.1609/aaai.v38i18.29983

Licence:: Terms and Conditions of Use for Oxford University Research Archive
