Conference item
Sample efficient model-free reinforcement learning from LTL specifications with optimality guarantees
- Abstract:
- Linear Temporal Logic (LTL) is widely used to specify high-level objectives for system policies, and it is highly desirable for autonomous systems to learn the optimal policy with respect to such specifications. However, learning the optimal policy from LTL specifications is not trivial. We present a model-free Reinforcement Learning (RL) approach that efficiently learns an optimal policy for an unknown stochastic system, modelled as a Markov Decision Process (MDP). We propose a novel and more general product MDP, reward structure and discounting mechanism that, when applied in conjunction with off-the-shelf model-free RL algorithms, efficiently learn the optimal policy that maximizes the probability of satisfying a given LTL specification, with optimality guarantees. We also provide improved theoretical results on choosing the key parameters in RL to ensure optimality. To directly evaluate the learned policy, we use the probabilistic model checker PRISM to compute the probability that the policy satisfies such specifications. Experiments on several tabular MDP environments across different LTL tasks demonstrate improved sample efficiency and convergence to the optimal policy.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Publisher copy:
- 10.24963/ijcai.2023/465
Authors
- Publisher:
- IJCAI
- Pages:
- 4180-4189
- Publication date:
- 2023-08-11
- Acceptance date:
- 2023-04-19
- Event title:
- 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023)
- Event location:
- Macao, China
- Event website:
- https://ijcai-23.org/
- Event start date:
- 2023-08-19
- Event end date:
- 2023-08-25
- DOI:
- 10.24963/ijcai.2023/465
- Language:
- English
- Pubs id:
- 1341400
- Local pid:
- pubs:1341400
- Deposit date:
- 2023-05-17
Terms of use
- Copyright date:
- 2023
- Notes:
- This paper will be presented at the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 19th-25th August 2023, Macau, S.A.R. This is the accepted manuscript version of the article. The final version is available online from IJCAI at: https://doi.org/10.24963/ijcai.2023/465