Conference item

Reinforcement learning for temporal logic control synthesis with probabilistic satisfaction guarantees

Abstract:
We present a model-free reinforcement learning algorithm to synthesize control policies that maximize the probability of satisfying high-level control objectives given as Linear Temporal Logic (LTL) formulas. Uncertainty is considered in the workspace properties, the structure of the workspace, and the agent actions, giving rise to a Probabilistically-Labeled Markov Decision Process (PL-MDP) with unknown graph structure and stochastic behaviour, which is even more general than a fully unknown MDP. We first translate the LTL specification into a Limit Deterministic Büchi Automaton (LDBA), which is then used in an on-the-fly product with the PL-MDP. Thereafter, we define a synchronous reward function based on the acceptance condition of the LDBA. Finally, we show that the RL algorithm delivers a policy that maximizes the satisfaction probability asymptotically. We provide experimental results that showcase the efficiency of the proposed method.
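Illustrative sketch (not the authors' implementation): the pipeline described in the abstract can be imitated on a toy problem with tabular Q-learning over product states that are built on the fly. Everything named here is an assumption made for illustration: the four-state corridor, the slip probability, the probabilistic "goal" label, and the hand-written two-state automaton for "eventually goal", which stands in for an LDBA produced from an LTL formula. The reward is 1 when the automaton reaches its accepting state, and the discounted return serves only as a simple proxy for the satisfaction probability.

```python
import random
from collections import defaultdict

ACTIONS = (-1, +1)      # move left / move right along a corridor (toy workspace)
N_STATES = 4            # MDP states 0..3; the learner never sees this model directly
SLIP = 0.1              # assumed probability that the chosen move is reversed
P_GOAL_LABEL = 0.9      # state 3 emits the label "goal" only with this probability


def mdp_step(s, a, rng):
    """Sample the next MDP state; the move is reversed with probability SLIP."""
    if rng.random() < SLIP:
        a = -a
    return min(max(s + a, 0), N_STATES - 1)


def sample_label(s, rng):
    """Probabilistic labelling: only state 3 can emit 'goal', and not always."""
    if s == N_STATES - 1 and rng.random() < P_GOAL_LABEL:
        return "goal"
    return "none"


def automaton_step(q, label):
    """Hand-written automaton for 'eventually goal': q=0 waiting, q=1 accepting."""
    return 1 if (q == 1 or label == "goal") else 0


def train(episodes=5000, horizon=50, alpha=0.05, gamma=0.8, epsilon=0.2, seed=0):
    """Tabular Q-learning over product states (s, q) created as they are visited."""
    rng = random.Random(seed)
    Q = defaultdict(float)  # keyed by (mdp_state, automaton_state, action)
    for _ in range(episodes):
        s, q = 0, 0
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                # Greedy action with random tie-breaking.
                a = max(ACTIONS, key=lambda b: (Q[(s, q, b)], rng.random()))
            s2 = mdp_step(s, a, rng)
            q2 = automaton_step(q, sample_label(s2, rng))
            accepted = q2 == 1
            reward = 1.0 if accepted else 0.0
            # The episode ends on acceptance, so no future value is bootstrapped.
            future = 0.0 if accepted else max(Q[(s2, q2, b)] for b in ACTIONS)
            Q[(s, q, a)] += alpha * (reward + gamma * future - Q[(s, q, a)])
            if accepted:
                break
            s, q = s2, q2
    return Q


def greedy_policy(Q):
    """Greedy action in each non-goal MDP state while the automaton is waiting."""
    return {s: max(ACTIONS, key=lambda b: Q[(s, 0, b)]) for s in range(N_STATES - 1)}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy is expected to head toward the labelled state. The guarantees stated in the abstract concern the paper's own construction and reward design, not this simplified sketch.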
Publication status:
Published
Peer review status:
Peer reviewed

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Role:
Author
ORCID:
0000-0002-6681-5283


Publisher:
IEEE
Host title:
2019 IEEE 58th Conference on Decision and Control (CDC)
Pages:
5338-5343
Publication date:
2020-03-12
Acceptance date:
2019-07-19
Event title:
Conference on Decision and Control (CDC)
Event location:
Nice, France
Event website:
https://cdc2019.ieeecss.org
Event start date:
2019-12-11
Event end date:
2019-12-13
DOI:
EISSN:
2576-2370
EISBN:
978-1-7281-1398-2


Language:
English
Keywords:
Pubs id:
pubs:1053310
UUID:
uuid:53301059-d6f2-49fe-9dd1-9b1b9aac944e
Local pid:
pubs:1053310
Source identifiers:
1053310
Deposit date:
2019-09-13
